To follow along, download the files:
How To GraphXR 4. Link and Filter
Before You Begin…
Ideally, you’ll have worked through Module 3. Properties and Extract. If you’re starting here, and you want to follow along, you’ll need to:
So far, we’ve extracted a House category from the Characters.csv data and created a belongs_to relationship linking House to Character nodes. Now drag and drop Lines.csv onto the graph space. It includes the total number of lines and words spoken by each character in every episode of HBO’s Game of Thrones.
You might want to zoom out: at over 3,000 nodes, this is a much larger dataset than Characters.csv. At any time, you can open the Table panel to see the graph and its properties in spreadsheet format. You can also view specific categories or relationships, or search by keyword. Do you notice the speaker property on Lines?
The values in speaker should match the values in Characters’ characterName. We can use these properties to link our characters to their line counts for each episode. Go to the Transforms panel => Link tab.
The Link transform enables creation of new edges belonging to a new or existing relationship. Let’s set Characters-spoke-Lines using characterName as the source property, and speaker as the target property, then click Run.
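Conceptually, a link of this kind behaves like a join on matching property values. The sketch below is plain Python, not GraphXR’s API; the sample rows and the link helper are hypothetical and exist only to illustrate the idea.

```python
# Conceptual sketch only: plain Python, not GraphXR's API.
# The sample rows and the link() helper are made up for illustration.
from collections import defaultdict

characters = [
    {"id": "c1", "characterName": "Jon Snow"},
    {"id": "c2", "characterName": "Arya Stark"},
]
lines = [
    {"id": "l1", "speaker": "Jon Snow"},
    {"id": "l2", "speaker": "Arya Stark"},
    {"id": "l3", "speaker": "Jon Snow"},
    {"id": "l4", "speaker": "King's Landing Man #7"},
]


def link(sources, source_prop, targets, target_prop, relationship):
    """Create one edge per (source, target) pair whose property values are equal."""
    # Index source nodes by the value of the source property.
    index = defaultdict(list)
    for node in sources:
        index[node[source_prop]].append(node["id"])

    # Look up each target's property value in that index.
    edges = []
    for node in targets:
        for source_id in index.get(node[target_prop], []):
            edges.append((source_id, relationship, node["id"]))
    return edges


edges = link(characters, "characterName", lines, "speaker", "spoke")
for edge in edges:
    print(edge)
```

In this toy data, the line node l4 gets no edge because its speaker value has no matching characterName.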
You’ll notice there are numerous nodes with no connections. These correspond to lines by characters like “King's Landing Man #7” who didn’t make it into the Characters.csv source data. Let’s clean up these extraneous nodes. Before we do, though, let’s turn on Snapshots to save our graph state in memory.
Go to the Project panel => Settings tab and enable Show Snapshot. Snapshots are a way of saving the current graph state in memory (rather than creating a view in the Data tab, which saves to the GraphXR server). This is useful for creating a history or library of graph states. Click the plus sign to capture a Snapshot.
You can click the arrow to show thumbnails of the snapshots you’ve taken so far. Now we can go back to cleaning up unconnected nodes in our graph.
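One way to picture an in-memory snapshot is as a stored copy of the graph that can be restored later. The sketch below is plain Python and says nothing about how GraphXR implements Snapshots; the SnapshotLibrary class and the toy graph are hypothetical.

```python
# Conceptual sketch only: not GraphXR's implementation.
# SnapshotLibrary and the toy graph are hypothetical.
import copy


class SnapshotLibrary:
    """Keeps independent in-memory copies of a graph state."""

    def __init__(self):
        self._snapshots = []

    def capture(self, graph):
        # Deep-copy so later edits to the live graph do not change the snapshot.
        self._snapshots.append(copy.deepcopy(graph))
        return len(self._snapshots) - 1

    def restore(self, index):
        # Return a fresh copy so the stored snapshot can be restored again.
        return copy.deepcopy(self._snapshots[index])


graph = {"nodes": ["c1", "l1", "l4"], "edges": [("c1", "l1")]}
library = SnapshotLibrary()
saved = library.capture(graph)

graph["nodes"].remove("l4")        # edit the live graph
restored = library.restore(saved)  # the unconnected node is back
```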
Open the Algorithm panel. We’ll use the Degree centrality algorithm to write each node’s number of connections to a new property, gxr_degree, then use it to select and remove nodes with a zero value. In the Centrality tab, click the Degree button. The algorithm computes the value for every node.
Now select the Property tab in the Legend and select gxr_degree from the dropdown menu. In the list of property values, locate the gxr_degree value of 0. Click it to select those nodes, then press delete or click the Delete icon to remove them.
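The degree-then-delete steps can be sketched in plain Python. This is an illustration of the idea under assumed toy data, not GraphXR’s API; the degree helper and the node ids are hypothetical, and only the property name gxr_degree comes from the tutorial.

```python
# Conceptual sketch only: plain Python, not GraphXR's API.
# The node ids, edges, and degree() helper are made up for illustration.
from collections import Counter

node_ids = ["c1", "c2", "l1", "l2", "l3", "l4"]
edges = [("c1", "l1"), ("c2", "l2"), ("c1", "l3")]


def degree(node_ids, edges):
    """Count how many edge endpoints touch each node."""
    counts = Counter()
    for source, target in edges:
        counts[source] += 1
        counts[target] += 1
    # Nodes that never appear in an edge get a degree of 0.
    return {node_id: counts[node_id] for node_id in node_ids}


gxr_degree = degree(node_ids, edges)

# Keep only nodes with at least one connection.
connected = [node_id for node_id in node_ids if gxr_degree[node_id] > 0]
```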
Now only nodes which have at least one connection remain. Alternatively, you can use a filter to accomplish the same thing. Let’s see how.
First, let’s load our saved snapshot. Open your snapshot library, locate the snapshot, and click the cloud icon to restore the graph with unconnected nodes.
Open the Algorithm panel and the Centrality tab, and click Degree to generate gxr_degree again. Now go to the Filter panel.
Open the Node Properties dropdown and select the gxr_degree property.
Set the Max value for gxr_degree to 0 to filter out any nodes with one or more connections. We can select all visible nodes by pressing Ctrl+A, clicking Select Fully Visible Nodes, or clicking the zero-value item in the Legend. Then press delete or click Delete in the context menu. Now press Del to clear your filter.
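The filter approach differs from the Legend approach in that it first narrows what is visible, then acts on everything that remains in view. A minimal sketch in plain Python, assuming made-up node records and a hypothetical apply_filter helper (this is not GraphXR’s API):

```python
# Conceptual sketch only: plain Python, not GraphXR's API.
# The node records and apply_filter() helper are made up for illustration.
nodes = [
    {"id": "c1", "gxr_degree": 2},
    {"id": "l1", "gxr_degree": 1},
    {"id": "l3", "gxr_degree": 1},
    {"id": "l4", "gxr_degree": 0},
    {"id": "l5", "gxr_degree": 0},
]


def apply_filter(nodes, prop, min_value=None, max_value=None):
    """Return the nodes whose property value lies within the given range."""
    visible = []
    for node in nodes:
        value = node[prop]
        if min_value is not None and value < min_value:
            continue
        if max_value is not None and value > max_value:
            continue
        visible.append(node)
    return visible


# A Max of 0 leaves only the unconnected nodes visible.
visible = apply_filter(nodes, "gxr_degree", max_value=0)

# Deleting everything visible, then clearing the filter, leaves connected nodes.
visible_ids = {node["id"] for node in visible}
remaining = [node for node in nodes if node["id"] not in visible_ids]
```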
What’s left are only nodes that have connections. Let’s take another Snapshot (or save a Data View or a GXRF file). Next we’ll work with Episodes.csv in Module 5. Aggregate and Merge.