5. Aggregate, Merge, and f(x)
Slide | |
---|---|
1 | How To GraphXR 5. Aggregate and Merge |
2 | Before You Begin… Ideally, you’ll have worked through Module 4. Link and Filter. If you’re starting here, and you want to follow along, you’ll need to:
|
3 | So far, we’ve extracted House nodes from the Characters.csv data and created BELONGS_TO edges connecting them with Character nodes. We’ve also created Lines nodes, along with SPOKE edges connecting Characters nodes to the Lines they spoke. |
4 | Now we’ll use Merge and Aggregate transforms to link lines of dialogue with the broadcast episodes they were spoken on. Drag and drop Episodes.csv onto the graph space to create the nodes of a new Episodes category. |
5 | The Episodes.csv file contains the title, description, air date, episode number, season number, and viewership for each broadcast episode. We’ll now connect lines of dialogue to their corresponding episodes. |
6 | We have a little problem, though. In Lines, the season and episode numbers are combined into a single string property called seasonEpisode. In Episodes, they occupy separate properties (seasonNumber and episodeNumber). To fix this, we’ll use a transform to generate a matching seasonEpisode property on our Episodes nodes. Open the Transform panel and select the f(x) tab. |
7 | The f(x) transforms let us run JavaScript formulas on the properties of a category or relationship, and place the result in a new (or existing) property. First, select the Episodes category and the episodeNumber property from the dropdown menus. |
8 | Choose toCustom, since we’ll need to enter a custom formula (rather than one of the preset formulas). Enter seasonEpisode as the new property name. |
9 | Enter the custom formula (propVal,props) => 'S'+props.seasonNumber+'E'+props.episodeNumber and check the example result, which is automatically displayed under the New Property Name. |
10 | Click Run and scroll to the bottom of the panel to view the results. Now that we have property values that match, we’ll link Episodes and Lines nodes. |
11 | Open the Link tab and choose Lines as the source category, Episodes as the target category, and enter SPOKEN_ON as the new relationship. Select seasonEpisode as the source and target property, and click Run. |
12 | The graph now reveals connections and properties we can explore further. Take a Snapshot, and then we’ll use Aggregate to add new properties based on existing ones, and Merge to simplify the graph. |
13 | Let’s find the number of lines per episode. Each Lines node has a lineCount property giving the number of lines for one speaker in one episode. With the Aggregate transform, we’ll sum those values and write the total to a new totalLines property on the connected Episodes nodes. |
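14 | Go to the Transform panel and Aggregate tab. Select the Episodes category and SPOKEN_ON relationship. |
15 | Click Property from Neighbor Nodes, and select lineCount. Under New Property enter totalLines, under Formula Name select sum, and click Run. |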
16 | Let’s also use Aggregate to calculate the number of unique characters per episode. Again, select the Episodes category and SPOKEN_ON relationship. |
17 | Click Property from Neighbor Nodes, and select the speaker property. Under New Property enter totalCharacters, under Formula Name select count, and click Run. |
18 | Now display a table to see the new totalLines and totalCharacters properties. |
19 | Now let’s simplify the graph using Merge. It combines nodes of a single category or the edges of a single relationship based on a property value. We’ll use the seasonEpisode property to merge Lines nodes for an episode into a single node. |
20 | Go to the Transform panel and Merge tab. Select the Lines category and the seasonEpisode property. |
21 | Now click Run. With only one Lines node per Episode, the simplified graph now clarifies connections between Episodes, Characters, and Lines of dialogue. |
22 | Save a data view, take a snapshot, and download the snapshot archive. |
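
The transforms used in this module can be sketched in plain JavaScript to show what each one computes. This is only an illustration, not GraphXR's API or implementation: the sample nodes, the speaker names, and the mergedFrom property are hypothetical, and only the f(x) formula itself is taken from slide 9. The count step equals the number of unique characters only because, as in the tutorial data, there is one Lines node per speaker per episode.

```javascript
// f(x): the custom formula from slide 9, applied to every Episodes node.
const toSeasonEpisode = (propVal, props) =>
  'S' + props.seasonNumber + 'E' + props.episodeNumber;

// Hypothetical sample nodes (not taken from the tutorial's CSV files).
const episodes = [
  { seasonNumber: 1, episodeNumber: 1 },
  { seasonNumber: 1, episodeNumber: 2 },
];
const lines = [
  { speaker: 'Character A', seasonEpisode: 'S1E1', lineCount: 3 },
  { speaker: 'Character B', seasonEpisode: 'S1E1', lineCount: 5 },
  { speaker: 'Character A', seasonEpisode: 'S1E2', lineCount: 2 },
];

for (const ep of episodes) {
  ep.seasonEpisode = toSeasonEpisode(ep.episodeNumber, ep);
}

// Link: create a SPOKEN_ON edge wherever seasonEpisode matches.
const spokenOn = [];
for (const line of lines) {
  for (const ep of episodes) {
    if (line.seasonEpisode === ep.seasonEpisode) {
      spokenOn.push({ source: line, target: ep });
    }
  }
}

// Aggregate: sum lineCount and count speaker values over neighbor Lines nodes.
for (const ep of episodes) {
  const neighbors = spokenOn
    .filter((edge) => edge.target === ep)
    .map((edge) => edge.source);
  ep.totalLines = neighbors.reduce((sum, n) => sum + n.lineCount, 0);
  ep.totalCharacters = neighbors.length;
}

// Merge: collapse Lines nodes that share a seasonEpisode into one node.
// Assumption: only the merge key and a tally are kept on the merged node.
const mergedLines = new Map();
for (const line of lines) {
  const node = mergedLines.get(line.seasonEpisode) ??
    { seasonEpisode: line.seasonEpisode, mergedFrom: 0 };
  node.mergedFrom += 1;
  mergedLines.set(line.seasonEpisode, node);
}

console.log(episodes);
console.log([...mergedLines.values()]);
```

After the merge step the sketch holds one Lines node per episode, which mirrors the simplified graph described in slide 21.
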
Next Steps…
How To GraphXR: Module 6. Shortcut.