Through the Looking-Glass at Netflix

By Jon Engelsman

December 30, 2018

A look at some of the back-end infrastructure used to deliver Netflix's interactive title, Bandersnatch.


Bandersnatch


Black Mirror: Bandersnatch is an interactive, choose-your-own-adventure show that follows the mind-bending drama of a video game programmer in the 1980s. Netflix recently released this new episode of the tech dystopia series Black Mirror after spending a significant amount of time and energy building out new “state tracking” technology to incorporate interactivity into the experience.

In this post, I take a look at the underlying data structures and code that Netflix developed for this experience, exploring network calls and large JSON data structures to try and make sense of their madness.

Choose-Your-Own-Adventure

Bandersnatch is a choose-your-own-adventure show about someone trying to make a choose-your-own-adventure video game based on a choose-your-own-adventure book. But the trick with choose-your-own-adventure stories is that they don’t follow linear storylines like traditional narratives, instead offering multiple story paths with different endings, all depending on the choices you make along the way. While non-linear storytelling has been part and parcel of the video game industry for decades, its implementation in a major streaming media platform is notable.

While watching Bandersnatch on Netflix, the viewer is presented with choices that pop up on the screen. Selecting one choice over another changes the video that you see as a new story segment unfolds.

A Choice in Bandersnatch Netflix

A number of industrious viewers on Reddit have attempted to map out the choices, branching storylines and endings for the Bandersnatch story, an example of which is shown below in a flowchart developed by Reddit user AppiusClaudius.

Narrative Flowchart of Bandersnatch Reddit/AppiusClaudius

The overall effect of providing choices and delivering a seamless video experience is impressive. So how did Netflix do it?

Interactive Entertainment at Netflix

Bandersnatch isn’t Netflix’s first foray into interactive media; it just happens to be the first one aimed at an adult audience. Last year, they premiered their first interactive title with the kids’ show episode Puss in Book: Trapped in an Epic Tale. And according to a Netflix help page, there are currently 5 interactive content titles with more on the way.

But it’s probably fair to say that Bandersnatch is Netflix’s most ambitious interactive title to date. So much so that they had to create new “state tracking” technology to handle the “millions of permutations of how you can play this story” according to Carla Engelbrecht, Netflix’s director of product innovation.

So what is this “state tracking” technology exactly and how does it work? To figure that out, I did what I always do when I’m curious about how websites do amazing things. I started digging into source code and network calls!

Network Calls

While watching Bandersnatch, I opened up the Chrome Developer Tools and just started watching all of the network calls. In particular, I was looking for anything that might have to do with interactivity or state-tracking functionality. Monitoring network calls while watching and making choices during the show, it looks like there are about 5 different request URLs that are regularly called.

Shakti Netflix
nflxvideo.net

The heavy lifting of content delivery is done by these GET requests to an nflxvideo.net endpoint, which return application/octet-stream binaries of content. They quickly and repeatedly return media data to make sure the viewer never experiences any video buffering.

`https://ipv4-c330-nyc001-ix.1.oca.nflxvideo.net/range/2727743-2752470?o=AQFn7d5Kh9LAp4jb6mieUi1lPH5kaAzEkOl_fiWm5Nwl9KZnnKTMLCgs0wPM4K8SZt3gZHeUzupAIiSM6UEicfursSFWTuhGvv8-ifgf_JxsrhoKVLOv_uF9PyDUsVDtbX7w9tYwR0WXIGRpo4Z3y4clZfW8J3Cu0zFlO2IzhkLUD5ovVgh7r5XMYvCZ9rkucx06MFp0SHOvTy3uk-QcjQ&v=3&e=1546178203&t=qyg5yHh9iczoSr_tvN3prltdmRc&sc=Eq%1F3%25Uc%03k%0Et%7DV%5DRh%0Cf%05Bh7%1F%5Ey%7C%05CY%5B` 

For the most part, these calls look pretty standard compared to other Netflix content streams, so they likely don’t reveal anything unique that might help us figure out the interactivity.
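As a rough illustration of how a browser player generally consumes responses like these, here’s a generic Media Source Extensions sketch that fetches one byte range and appends it to a playback buffer. This is just a sketch of the standard web API, not Cadmium’s actual code, and the URL and codec string below are placeholders:

// Generic MSE sketch -- not Netflix's actual player code.
// The URL and codec string below are placeholders.
const videoElement = document.querySelector('video');
const mediaSource = new MediaSource();
videoElement.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', async () => {
  const buffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.640028"');
  // Fetch one byte range of media data and append it to the buffer.
  const resp = await fetch('https://example.oca.nflxvideo.net/range/2727743-2752470');
  buffer.appendBuffer(await resp.arrayBuffer());
});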

Personalization

Another set of network calls looks a bit more promising: the Personalization calls. These POST requests submit a data-binary payload to a /personalization/cl2 endpoint at the primary www.netflix.com domain.

`https://www.netflix.com/personalization/cl2`

The data-binary payload consists of a JSON data structure which includes a lot of expected tracking information (user ID, session ID, etc). But it also appears to include a ‘state’ object which might tell us something useful about interactivity.
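As a very loose sketch of the shape of these calls (every field name here is a hypothetical stand-in, since the real payload is much larger and partly opaque), it’s roughly a POST like this:

// Hypothetical shape of a /personalization/cl2 logging call.
// All field names are illustrative guesses, not Netflix's actual schema.
fetch('https://www.netflix.com/personalization/cl2', {
  method: 'POST',
  body: JSON.stringify({
    userId: '...',     // hypothetical tracking fields
    sessionId: '...',
    events: [],
    state: {}          // the 'state' object observed in the payload
  })
});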

MSL and Cadmium

There are two network calls that send POST requests to a Netflix Message Security Layer (MSL) API.

Events:

`https://www.netflix.com/nq/msl_v1/cadmium/pbo_events/%5E1.0.0/router`

Log Blobs:

`https://www.netflix.com/nq/msl_v1/cadmium/pbo_logblobs/%5E1.0.0/router`

This MSL API is used to “transport data between two or more communicating entities”. These network calls appear to be sending log/event data related to Cadmium, Netflix’s homegrown JavaScript video player.

The data payloads of these calls seem to include token and signature information about the user, but it’s not clear what is in the data binaries being sent. However, seeing as similar calls are regularly sent during streaming of non-interactive media, it’s likely that these calls aren’t involved in any interactive functionality.

I should note that I’m glossing over a lot of detail in these network requests, specifically in regards to request/response headers, full data payloads, etc. I’m just pulling out enough detail to try and get a gist of what’s happening, but I encourage you to explore some of them on your own.

Shakti

Last but not least on our list of network calls are a set of POST requests to a Netflix API named Shakti.

`https://www.netflix.com/api/shakti/vdf992f93/pathEvaluator?drmSystem=widevine&isWatchlistEnabled=false&isShortformEnabled=false&isVolatileBillboardsEnabled=false&method=call&falcor_server=0.1.0&withSize=true&materialize=true`

This network request is the one we’re interested in here, so let’s explore it in a bit more detail.

More On Shakti

From the little information available publicly, it seems that the Shakti API is another data fetching service. The name “Shakti” is mentioned briefly in a 2014 slide from Ryan Anklam, then Senior UI Engineer at Netflix, discussing new Node.js services at the company.

Shakti Ryan Anklam

It’s also mentioned in a GitHub repo of the same name from user HowardStark, though that appears to be more of an outside-in reverse engineering attempt than any official Netflix description of the service.

Ok, so we have somewhat of an idea what the Shakti API might be, but what else is going on here? Well, looking back at the Shakti API calls, there appears to be a query parameter named falcor_server in the request URLs. Seeing as Falcor is Netflix’s custom JavaScript library for “efficient data fetching”, the idea that Shakti is a data fetching API seems to check out.

Shakti Call Paths

There’s one more important thing to note about the Shakti API network calls. While their URLs look similar, there appear to be at least 8 distinct types of Shakti API calls, distinguishable by their data-binary payloads. More specifically, by the values for a key named callPath in the data-binary JSON payload. Based on a cursory reading of the Falcor documentation, this payload is likely used by Falcor, where the values for callPath route to different internal services.

I’m going to skip over the first 5 Shakti callPaths, but they’re listed below for reference.

  • Reno / Lolomo: "callPath":["reno","newLolomo"]

  • Preplay(s):

    "paths":[["videos",80988062,"bookmarkPosition"],["videos",80988062,"preplay",-1,"experience"],["videos",80988062,"preplay",-1,"playbackVideos",{"from":0,"to":1},["availability","bookmarkPosition","creditsOffset","current","requiresPreReleasePin","runtime","summary"]]]
    
  • Postplay: "callPath":["videos",80988062,"postplay"]

  • Advisories: "callPath":["videos",80988062,"advisories"]

  • Dynamic Messages: "callPath":["dynamicMessages"]

But wait, what’s this! At least 3 of the 8 callPaths used in Shakti requests have the word “interactive” in them.

  • logInteractivePlaybackImpression: "callPath":["logInteractivePlaybackImpression"]

  • logInteractiveStateSnapshots: "callPath":["logInteractiveStateSnapshots"]

  • interactiveVideoMoments: "callPath":["videos",80988062,"interactiveVideoMoments"]

This looks promising. Not only are we seeing the word “interactive” but there’s also a “state” in there as well. We’ll come back to the two “log” callPaths in a bit, but for now let’s check out the response of the interactiveVideoMoments callPath.

Interactive Video Moments

The interactiveVideoMoments callPath is 1 of 6 Shakti API requests that are called on page load, and it appears to only be called once. So it’s reasonable to assume this is some sort of initialization call.
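A minimal sketch of what that initialization call might look like, using the request URL and callPath value seen in the network traffic. The exact payload encoding, headers, and auth are omitted or guessed here:

// Sketch of the Shakti pathEvaluator call that fetches interactiveVideoMoments.
// The endpoint and callPath come from the observed traffic; everything else
// in the real payload (auth, extra fields, exact encoding) is left out.
fetch('https://www.netflix.com/api/shakti/vdf992f93/pathEvaluator?method=call&falcor_server=0.1.0', {
  method: 'POST',
  body: JSON.stringify({
    callPath: ['videos', 80988062, 'interactiveVideoMoments']
  })
})
  .then(res => res.json())
  .then(data => {
    const moments = data.jsonGraph.videos['80988062'].interactiveVideoMoments.value;
    console.log(Object.keys(moments)); // stateHistory, momentsBySegment, preconditions, ...
  });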

But the most noticeable thing about the response data for the interactiveVideoMoments callPath is that it’s big. Really big. Prettifying the JSON structure, it’s over 25,000 lines long.

In copying the response data below, I’ve cut out most of the actual content in order to show the structure of the JSON Graph.

{
  "jsonGraph": {
    "videos": {
      "80988062": {
        "interactiveVideoMoments": {
          "$type": "atom",
          "$size": 273073,
          "value": {
            "type": "bandersnatch",
            "choicePointNavigatorMetadata": {
              "config": {
              },
              "storylines": {
              },
              "type": "snapshots",
              "choicePointsMetadata": {
                "timelineLabel": [],
                "choicePoints": {...},
                "choices": null
              }
            },
            "commonMetadata": {
              "layouts": {...},
              "settings": {...}
            },
            "segmentHistory": [],
            "stateHistory": {...},
            "snapshots": [],
            "momentsBySegment": {...},
            "preconditions": {...},
            "audioLocale": "en",
            "segmentGroups": {...}
          }
        }
      }
    }
  },
  "paths": [
    [
      "videos",
      "80988062",
      "interactiveVideoMoments"
    ]
  ]
}

Within this JSON Graph, there seem to be at least 4 components that define the interactive behavior of Bandersnatch:

  • **stateHistory:** initialization of 62 state variables (59 boolean and 3 multivariate)

  • **momentsBySegment:** a list of video segments by type (scenes, impressions, post plays, etc.) describing state preconditions, new state data and choices (themselves defined by segments)

  • **preconditions:** a list of precondition definitions for each segment, defined by simple-to-complex conditional logic using the state variables

  • **segmentGroups:** a list of segment group IDs and the segments that make them up, including precondition requirements

Let’s look at some of these components in more detail from the perspective of the first choice in Bandersnatch, where Stefan’s dad asks if he wants Sugar Puffs or Frosties cereal.

A Choice Netflix

State History

Let’s start with the stateHistory list. At face value, this list of state parameters doesn’t tell us much. But we know it’s a state initialization of some kind, and since we know that Netflix created new “state tracking” technology for this interactive feature, I’m guessing it’s an important component! We’ll come back to this.

"stateHistory": {
       "p_sp": true,
       "p_tt": true,
       "p_8a": false,
       "p_td": true,
       "p_cs": false,
       "p_w1": false,
       "p_2b": false,
       "p_3j": false,
       "p_pt": false,
       "p_cd": false,
       "p_cj": false,
       "p_sj": false,
       "p_sj2": false,
       "p_tud": false,
       "p_lsd": false,
       "p_vh": false,
       "p_3l": false,
       "p_3s": false,
       "p_3z": false,
       "p_ps": "n",
       "p_wb": false,
       "p_kd": false,
       "p_bo": false,
       "p_5v": false,
       "p_pc": "n",
       "p_sc": false,
       "p_ty": false,
       "p_cm": false,
       "p_pr": false,
       "p_3ad": false,
       "p_s3af": false,
       "p_nf": false,
       "p_np": false,
       "p_ne": false,
       "p_pp": false,
       "p_tp": false,
       "p_bup": false,
       "p_be": false,
       "p_pe": false,
       "p_pae": false,
       "p_te": false,
       "p_snt": false,
       "p_8j": false,
       "p_8d": false,
       "p_8m": false,
       "p_8q": false,
       "p_8s": false,
       "p_8v": false,
       "p_vs": "n",
       "p_scs": false,
       "p_3ab": false,
       "p_3ac": false,
       "p_3aj": false,
       "p_3ah": false,
       "p_3ak": false,
       "p_3al": false,
       "p_3af": false,
       "p_5h": false,
       "p_5ac": false,
       "p_5ag": false,
       "p_5ad": false,
       "p_6c": false
}
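Assuming the player keeps a mutable working copy of these values for the viewing session (an assumption on my part), seeding that state from the Shakti response could be as simple as:

// Sketch: seed a working state from the stateHistory initializer.
// "moments" is the interactiveVideoMoments value from the Shakti sketch above.
const persistentState = { ...moments.stateHistory };

// Later on, a segment's state updates could be folded back into it, e.g.:
// Object.assign(persistentState, { p_sp: true });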

Moments By Segment

Next, scanning through the momentsBySegment list we come across a segment 1A that seems related to this first choice in the show, the one between Sugar Puffs (choice 1E) and Frosties (choice 1D).

"1A": [
  {
    "type": "scene:cs_bs",
    "startMs": 135880,
    "endMs": 153520,
    "activationWindow": [
      135880,
      149520
    ],
    "id": "1A",
    "layoutType": "l2",
    "uiDisplayMS": 139520,
    "uiHideMS": 149520,
    "defaultChoiceIndex": 0,
    "choiceActivationThresholdMS": 135880,
    "choices": [
      {
        "id": "1E",
        "segmentId": "1E",
        "startTimeMs": 153520,
        "text": "SUGAR PUFFS"
      },
      {
        "id": "1D",
        "segmentId": "1D",
        "startTimeMs": 5442480,
        "text": "FROSTIES"
      }
    ],
    "trackingInfo": {
      "viewableId": 80988062
    },
    "impressionData": {
      "type": "userState",
      "data": {
        "persistent": {
          "p_sp": true,
          "p_tt": true,
          "p_8a": false,
          "p_td": true,
          "p_cs": false,
          "p_w1": false,
          "p_2b": false,
          "p_3j": false,
          "p_pt": false,
          "p_cd": false,
          "p_cj": false,
          "p_sj": false,
          "p_sj2": false,
          "p_tud": false,
          "p_lsd": false,
          "p_vh": false,
          "p_3l": false,
          "p_3s": false,
          "p_3z": false,
          "p_ps": "n",
          "p_wb": false,
          "p_kd": false,
          "p_bo": false,
          "p_5v": false,
          "p_pc": "n",
          "p_sc": false,
          "p_ty": false,
          "p_cm": false,
          "p_pr": false,
          "p_3ad": false,
          "p_s3af": false,
          "p_nf": false,
          "p_np": false,
          "p_ne": false,
          "p_pp": false,
          "p_tp": false,
          "p_bup": false,
          "p_be": false,
          "p_pe": false,
          "p_pae": false,
          "p_te": false,
          "p_snt": false,
          "p_8j": false,
          "p_8d": false,
          "p_8m": false,
          "p_8q": false,
          "p_8s": false,
          "p_8v": false,
          "p_vs": "n",
          "p_scs": false,
          "p_3ab": false,
          "p_3ac": false,
          "p_3aj": false,
          "p_3ah": false,
          "p_3ak": false,
          "p_3al": false,
          "p_3af": false,
          "p_5h": false,
          "p_5ac": false,
          "p_5ag": false,
          "p_5ad": false,
          "p_6c": false
        }
      }
    },
    "uiInteractionStartMS": 139520,
    "config": {
      "intervalBasedVideoTimer": true,
      "disableImmediateSceneTransition": true,
      "disablePrematureUserInteraction": true,
      "hideChoiceLabelWhenChoiceHasImage": true,
      "randomInitialDefault": true
    }
  }
]

There’s a lot to unpack here, but there are three things that stand out. For one, there are multiple definitions for start and end times in milliseconds. These values seem to define specific sections of video that relate to both the segment preceding a choice and the different choices being made. It’s also notable that segment 1A has only one set of definitions of a type “scene:cs_bs”.

We can also see that the userState object inside impressionData looks similar to the stateHistory list of parameters. Since this data seems to be independent of which choice is selected, I’m guessing this state data is updated before a choice is even made, maybe at some point while the preceding video segment is playing.

Let’s take a look at the details for one of the two choices, segment 1E (aka Sugar Puffs):

"1E": [
  {
    "type": "notification:playbackImpression",
    "startMs": 153520,
    "endMs": 207240,
    "precondition": [
      "not",
      [
        "eql",
        [
          "persistentState",
          "p_sp"
        ],
        true
      ]
    ],
    "impressionData": {
      "type": "userState",
      "data": {
        "persistent": {
          "p_sp": true
        }
      }
    }
  },
  {
    "type": "scene:cs_bs",
    "startMs": 190640,
    "endMs": 207240,
    "activationWindow": [
      190640,
      203240
    ],
    "id": "1E",
    "layoutType": "l2",
    "uiDisplayMS": 194280,
    "uiHideMS": 203240,
    "defaultChoiceIndex": 0,
    "choiceActivationThresholdMS": 190640,
    "choices": [
      {
        "id": "1H",
        "segmentId": "1H",
        "startTimeMs": 207240,
        "text": "THOMPSON TWINS"
      },
      {
        "id": "1G",
        "segmentId": "1G",
        "startTimeMs": 5496880,
        "text": "NOW 2"
      }
    ],
    "trackingInfo": {
      "viewableId": 80988062
    },
    "uiInteractionStartMS": 194280,
    "config": {
      "intervalBasedVideoTimer": true,
      "disableImmediateSceneTransition": true,
      "disablePrematureUserInteraction": true,
      "hideChoiceLabelWhenChoiceHasImage": true,
      "randomInitialDefault": true
    }
  }
]

From segment 1A, we know that segment 1E is the choice for Sugar Puffs. Unlike segment 1A, we see two sets of definitions: one for a type “scene:cs_bs” (same as segment 1A) and one for a type “notification:playbackImpression”. The definition for this second type includes a precondition on the state p_sp and a userState update of that same state parameter. I’ll go out on a limb and claim that the “sp” in p_sp stands for “Sugar Puffs”.

The definitions for the “scene:cs_bs” include start and end timings, as well as data on the next choice at the end of this segment, specifically the choice of music tapes between Thompson Twins (segment 1H) and Now 2 (segment 1G).

It’s interesting to note that the data for segment 1D (Frosties) looks very similar to segment 1E, including the data for the subsequent choice between segments 1H and 1G, with the notable difference being that the state parameter p_sp is set to false (i.e. not Sugar Puffs).

We’re starting to see a bit of a trend here, so let’s summarize what we’ve found so far. From what we’ve seen, segment definitions can include:

  • Start and end times related to some kind of video sequence
  • Preconditions of state parameters related to playback impressions (whatever those might be)
  • Updates to state parameters
  • Choices (either 1 or 2) that can occur at some point in a segment, and the IDs for segments related to those choices
  • One or more types of definitions for each segment, including the following types:
    • “notification:action”
    • “notification:playbackImpression”
    • “scene:cs_bs”
    • “scene:cs_bs_phone”
    • “scene:interstitialPostPlay_v2”

In regards to these segment definition types, there seem to be two notification types and three scene types. For the time being, I won’t go into detail on how these segment definitions can differ.
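To make the timing fields a bit more concrete, here’s a rough sketch of how a player might resolve a choice point from one of these “scene” moments. This is my reading of the fields, not Netflix’s actual player logic (and randomInitialDefault suggests the default may not always be index 0):

// Sketch of resolving a choice point from a "scene" moment -- my reading of
// the fields, not Netflix's implementation.
function resolveChoice(moment, selectedIndex) {
  // Fall back to the default choice if the viewer never picked one
  // before the UI was hidden at moment.uiHideMS.
  const index = (selectedIndex == null) ? moment.defaultChoiceIndex : selectedIndex;
  return moment.choices[index]; // { id, segmentId, startTimeMs, text }
}

// Rough playback timeline (all times are ms into the master video):
//   moment.uiDisplayMS  -> show the choice UI
//   moment.uiHideMS     -> hide it and fall back to the default if needed
//   then jump the player to resolveChoice(...).startTimeMs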

Preconditions

Our next component type is a precondition. Let’s jump ahead a bit in the narrative of Bandersnatch to explore this new component. At one point in the sequence, Stefan visits Dr. Haynes’ office and the viewer is faced with a choice for Stefan of Biting Nails or Pulling Earlobe.

Scanning through the momentsBySegment list for this choice, we find that it’s defined as segment 3R. Then, looking for that segment ID in the list of preconditions, we find this definition.

"3R": [
  "not",
  [
    "persistentState",
    "p_vh"
  ]
]

Seems simple enough. This segment has a precondition on only one state variable, the state p_vh (vh = Visit Haynes?). However, it’s not clear what this precondition does exactly. Is it related to a playback? Or a scene?

Looking at the long list of preconditions, it seems that they can range from simple logic expressions involving only one state to more complex expressions involving many states. Although we don’t know exactly how these preconditions are used, it’s fair to assume that they are static definitions that act as a flow control of sorts for the branching segment structure, opening and closing narrative pathways depending on the different state values.
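These expressions read like a small Lisp-style rules language. Here’s a minimal sketch of an evaluator covering the operators visible in the dump (“not”, “eql”, “persistentState”); whether the player actually evaluates them this way, and whether operators like “and” and “or” exist, are assumptions on my part:

// Sketch of a recursive evaluator for the precondition expressions.
// Only operators visible in the dump are certain; "and"/"or" are guesses.
function evaluate(expr, state) {
  if (!Array.isArray(expr)) return expr;             // literal: true, "n", ...
  const [op, ...args] = expr;
  switch (op) {
    case 'persistentState': return state[args[0]];   // look up a state variable
    case 'not': return !evaluate(args[0], state);
    case 'eql': return evaluate(args[0], state) === evaluate(args[1], state);
    case 'and': return args.every(a => evaluate(a, state));  // assumed operator
    case 'or':  return args.some(a => evaluate(a, state));   // assumed operator
    default: throw new Error('unknown operator: ' + op);
  }
}

// Example: segment 3R's precondition against the initial state (p_vh is false)
evaluate(['not', ['persistentState', 'p_vh']], persistentState); // => true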

Segment Groups

Let’s stick with segment 3R for a second and take a look at our last set of components in our JSON Graph that seem important, segmentGroups. This component seems to be a way to organize the structure of how segments are connected. Some segmentGroups have a static set of segments making up a group, whereas others have dynamic definitions based on preconditions.

For example, segment 3R shows up in two different segmentGroups, VisitHaynesChoice and 3Q. The group VisitHaynesChoice is a collection of 6 different segment IDs.

"VisitHaynesChoice": [
  "ZP",
  "ZQ",
  "3Xa",
  "3Xac",
  "3S",
  "3R"
]

And the segmentGroup 3Q is a collection of 2 segment IDs, where one of them (3S) appears to only be included in the group based on a precondition criteria labeled 3S_s3Q.

"3Q": [
  {
    "segment": "3S",
    "precondition": "3S_s3Q"
  },
  "3R"
]

Looking at other segmentGroups, we see ones statically defined like VisitHaynesChoice, others dynamically defined with preconditions like 3Q, and even some that include other groups within a segmentGroup.

As if this isn’t confusing enough, it should be pointed out that an ID name can refer to a specific segment, a segmentGroup, a precondition or a momentsBySegment entry, and a given ID can belong to just one of these or be shared across several of them. So whereas the ID 3S_s3Q is just a precondition, the ID 3Q is both a segment and a segmentGroup.
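Putting the last two pieces together, here’s a sketch of how a segmentGroup might be expanded into the segments that are currently reachable, reusing the evaluate() sketch from above. How nested groups and shared IDs are really handled is an assumption:

// Sketch: expand a segmentGroup into currently reachable segment IDs,
// given the working state. Handling of nested groups, and of IDs that name
// both a segment and a group, is guesswork on my part.
function resolveGroup(groupId, moments, state) {
  return moments.segmentGroups[groupId].flatMap(entry => {
    if (typeof entry === 'string') {
      // Plain entry: either a nested group or a segment ID.
      return moments.segmentGroups[entry]
        ? resolveGroup(entry, moments, state)
        : [entry];
    }
    // Gated entry: include the segment only if its named precondition holds.
    return evaluate(moments.preconditions[entry.precondition], state)
      ? [entry.segment]
      : [];
  });
}

// Example: which of 3S / 3R is reachable from group "3Q" right now?
resolveGroup('3Q', moments, persistentState);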

Mapping It All Out

Mapping out the complete flow of all of these different segments, groups, preconditions and states is a daunting task. One way to make sense of all of this is to show how the logic and data structures above tie in to specific video segments.

Fortunately, a Reddit post by user iamthecage not only shows us that the episode is in fact a single 5+ hour long “master” video, but also demonstrates a way to programmatically jump to different parts of the video. The post includes some JavaScript code that can be entered into a Developer Console, allowing you to manually jump to specific times in the video (in milliseconds). I’ve copied this code snippet below for convenience if you want to try it out yourself.

// Expose the native <video> controls for the Bandersnatch title (video ID 80988062)
document.evaluate('//*[@id="80988062"]/video',document).iterateNext().setAttribute("controls", "controls")

// Read the current playback position of the open playback session
netflix.appContext.getPlayerApp().getAPI().getOpenPlaybackSessions()[0].currentTime

// Grab Netflix's internal video player API
const videoPlayer = netflix
  .appContext
  .state
  .playerApp
  .getAPI()
  .videoPlayer

// Getting player id
const playerSessionId = videoPlayer
  .getAllPlayerSessionIds()[0]

const player = videoPlayer
  .getVideoPlayerBySessionId(playerSessionId)

// Jump to a specific time in the master video (in milliseconds)
player.seek(1577120)

(Credit: iamthecage)

Using our example of segment 3R from above, we can map out some of its related video clips using start and end times detailed in both the choicePoints and momentsBySegment components (shown in the image below).

Segment 3R

This shows how these data components (in JSON format) are defined and how they relate to specific points in the master video for this specific segment 3R. Looking at it another way, we can show how the state parameters and preconditions trigger different combinations of video segment types (scenes, playback impressions, etc) and their associated choices.

Master Video

From this, we can start to see how these components work together in mapping out the story flow. And how they’re used to jump around to different segments of the master video depending on which choices are made and the overall state of the interactive history, somehow all resulting in a seamless media experience.
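As one last sketch tying this back to the seek trick above: given the momentsBySegment data, jumping to whatever segment a choice points at is (at least conceptually) just a seek into the master video. The lookup below assumes the first entry for a segment is its scene moment, which isn’t guaranteed; player is the object from the earlier snippet and moments is the interactiveVideoMoments value.

// Sketch: jump the master video to the segment a choice points at.
// Assumes startTimeMs values are offsets into the 5+ hour master video and
// that the first entry in momentsBySegment is the segment's scene moment.
function jumpToChoice(player, moments, segmentId, choiceText) {
  const moment = moments.momentsBySegment[segmentId][0];
  const choice = moment.choices.find(c => c.text === choiceText);
  player.seek(choice.startTimeMs);
}

// e.g. jumpToChoice(player, moments, '1A', 'SUGAR PUFFS') would seek to 153520,
// the start of segment 1E.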

OK, This Seems Complicated, Right?

Looking back at some of the flowcharts I’ve seen on Reddit, I have to wonder if they’re missing some narrative pathways, specifically because they seem to take only a rudimentary account of the 62 state parameters. A count of items in each of the main components we’ve explored shows just how complex and large a structure Bandersnatch appears to be under the surface:

  • Preconditions: 241
  • momentsBySegment: 208
  • segmentGroups: 111
  • stateHistory: 62

With so many segments and state parameters, there’s the potential for a lot of narrative variability depending on the complexity of preconditions for the various video segments.

To handle this complexity, we know that Netflix had to create a new piece of software called Branch Manager to build out the non-linear narrative. And showrunner Charlie Brooker describes the difficulty in mapping out Bandersnatch using just a flow chart:

“You couldn’t do this in a flow chart because it’s dynamic and tracking what state you are in and doing things accordingly,” Brooker explained; with the nifty tool, he could “input and deliver his evolving script directly to Netflix.”

Another interesting aspect of all of this is that Netflix somehow managed to implement these interactivity components within the context of their existing streaming infrastructure. They took a 5+ hour video, built a complex state/precondition/segmentation layer on top of it and then developed a process to jump back and forth to different points in the master video, all without any video buffering. The fact that they were able to do all of this relying on the same Shakti API services that they use for other Netflix content really speaks to the robustness and versatility of the services that they’re building.

Source Code and Akira

Ok, so we’ve seen how Netflix used their Shakti API to deliver initialization data that defines the entire interactive narrative and its video structure. But how does that all work together to actually manipulate the video that’s being watched and provide for a seamless media experience? Did they build something new to handle the video segmentation aspect of interactivity? To look into this, we need to check out some source code on the client side.

Looking through script tags in the main HTML page of the Bandersnatch title, I came across an interesting JavaScript library named Akira. Unminified, it’s over 89,000 lines of code, so not a small library by any means. The request URL below loads this client-side library:

https://codex.nflxext.com/%5E2.0.0/truthBundle/webui/0.0.1-akira-js-mk-v6faf08f4/js/js/akira%7CakiraClient.js/2/4u4A494b4k06424f4z040n004B4e474i4h4c4p4s4n444g4y114v/l/true/none

I couldn’t find any reference to this specific Netflix library online, so I’m not sure if it’s something new or something that Netflix has been using for a while. But looking at another episode of Black Mirror (a non-interactive episode), we see a similar Akira library being loaded via the request URL below:

https://codex.nflxext.com/%5E2.0.0/truthBundle/webui/0.0.1-akira-js-mk-v6faf08f4/js/js/akira%7CakiraClient.js/2/4u4l4A494b4k06424f4z040n004B4e474h4c4p4s4n444g114v/l/true/none

Although the URLs look similar, we see a long character string (maybe a hash?) that looks slightly different in each of the two requests.

Interactive:     4u4A494b4k06424f4z040n004B4e474i4h4c4p4s4n444g4y114v
Non-Interactive: 4u4l4A494b4k06424f4z040n004B4e474h4c4p4s4n444g114v

Unminifying the non-interactive Akira library shows just under 83,000 lines of code, about 6,000 fewer lines than the interactive version. And a cursory comparison of these libraries shows that the interactive Akira library has references to the four interactive components we’ve explored (preconditions, stateHistory, momentsBySegment and segmentGroups), while the non-interactive Akira library does not.

So it appears that this Akira library is handling most (or all) of the client-side functionality of Netflix’s “state tracking” technology, in theory updating state values, evaluating preconditions and jumping between video segments, all based on the data components loaded by the Shakti API.

While it’s unclear whether or not Netflix created the Akira library specifically for Bandersnatch, the episode contains a subtle (if not anachronistic) nod to Katsuhiro Otomo’s hit 1982 manga of the same name. During Stefan’s visit to Colin’s apartment, a large black and white print showing the destruction of Neo-Tokyo looms large in the background. I’d like to think it’s a reference to the JavaScript library that seems to have made Bandersnatch possible.

Akira Netflix

UPDATE: It turns out that Netflix has been using the Akira client for a while now! I found a 2015 article about Netflix interactions where then Director of Engineering Operations Josh Evans provides the following comment:

“We created what we call our ‘Darwin’ user interface, moving from vertical to horizontal box shots, and we tuned our algorithms … a lot of innovation went in,” says Evans. The same kind of interface is on the website, too, creating what Evans calls the ‘Akira’ user interface. “All the information you need is at your fingertips,” he says. It sounds simple, but it’s built on advanced telemetry, real-time analytics and advanced machine learning.

So while it seems that Netflix has been using the Akira user interface for a while now, I still think it’s important to note that they serve up two different versions of the client library, depending on whether the title is interactive or not.

Through the Looking-Glass

It’s clear that Netflix put a lot of effort into developing this interactive technology for Bandersnatch. To do this, they seem to have built out a narrative-specific JSON Graph, more than 6,000 lines of new client-side code and even a new Branch Manager tool to lay out the complex narrative structure of the episode.

This write-up is an attempt to make sense of some of that work, to explore and wrap my head around the details that went into the new streaming technology behind Bandersnatch. It’ll be interesting to see what new interactive titles Netflix might develop next, and whether or not they will continue to build out the technology and concepts explored in this post.

Thanks for reading!

Also published at Medium.com on December 29, 2018.
