Cypher and Neo4j: Part I

A few months ago I started working with graph databases. This post is part of a series aimed at documenting how to work with a graph database, particularly for those coming from a relational database background.

At a practical level, when first working with a database you want to know how to get it installed and running (which was the subject of an earlier post) and then how to do the basic CRUD operations: creating data, retrieving it, making updates, and then deleting things. The purpose of this post is just to focus on using Neo4j and the query language particular to that database, Cypher.

One of the very nice features of Neo4j 2 is that they included a very friendly way to interact with the database by just pointing your web browser at it. This is not how you will work with Neo4j at scale but when learning to use it, it is invaluable.  When working within the browser you generally enter one statement at a time. If you try to put two statements separated by a semi-colon, Neo4j will get confused.

In Neo4j you work with a property graph data model, which is to say a set of nodes with connecting edges, both of which can have properties (or attributes if you prefer) attached to them. In Neo4j nodes can also have labels and edges have a type, which you can think of as allowing you to define a kind of node or edge. (This is in contrast to Titan, where edges can have labels, but not nodes.)

The first thing to note is that in Neo4j, you do not need to set up a schema before you can start loading in and using the database. In fact there isn’t really a concept of a schema like you would have in a relational database, and in Cypher there isn’t an equivalent of the Data Definition Language (DDL) that we have in SQL; there aren’t equivalents for ‘create table’, ‘drop table’, ‘alter table’. If you are coming from a primarily relational database background then this certainly feels a bit odd.

Creating a Node

The example below shows how we might create a single node with a couple of attributes. Note that the indentation here is purely to enhance readability — Neo4j will process this fine if it is all on the same line.

CREATE ( :Person { name: "Alice",
                   email: "alice@wherever.com"
                 }
       )

Cypher is deliberately designed to feel like SQL. Instead of SQL’s ‘INSERT’, though, we add new nodes to the database with a ‘CREATE’ command. Now we’ll deconstruct the rest of this statement. The ‘:Person’ part is a label that allows you to identify the particular kind of node this is; the leading colon is what marks it as a label, and a bare name in that position would be read as a variable instead. Unless you have a graph in which all the nodes are the same type, you will likely want to add a descriptive label here. The label needs to start with a letter but it can be composed of letters and numbers. All the properties for this node are in curly braces as a comma-separated list of key-value pairs, with a colon separating keys from values (instead of an equals sign).

Note that there is nothing here in this statement that identifies a primary key. Nor do we have a concept of referential integrity between kinds of nodes. Enforcing uniqueness is possible using the MERGE command, which we’ll get to in a later post. Internally in Neo4j there is a unique node id and you can make use of that, but you shouldn’t rely on that much.

Just for comparison, if this was a relational database the equivalent statement in SQL would be something like:

INSERT INTO Person(name, email) VALUES('Alice', 'alice@wherever.com');

Searching for a Node

Now that we have a node in here, how would we search for it? In general we can search using Cypher’s MATCH statement, which you can think of as the equivalent of SQL’s SELECT. Like SELECT, we use MATCH all the time when working with Neo4j.

MATCH ( k{name:"Alice"}) RETURN k

In this statement we have a ‘k’ in there before the attribute list. That is a variable that we can use elsewhere in the statement, and we use it at the end in the RETURN clause to actually return the value. MATCH cannot stand on its own: it has to be followed by another clause such as RETURN, SET, DELETE, or CREATE, otherwise the statement is considered incomplete.

If you run this command in the browser, Neo4j 2.x will give you a D3-based visualization of your result set. You can click on the node(s) to show all the attributes. This kind of feedback is very helpful when learning Neo4j and developing your graph statements.

Our statement here returned the entire node. If you want just a particular attribute you can return that.

MATCH ( k{name:"Alice"}) RETURN k.email
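
As a rough sketch of another way to write the same search (not something we strictly need here), the filter can be moved into a WHERE clause and the returned columns given names with AS. The aliases ‘name’ and ‘email’ are just names chosen for illustration.

```cypher
// Same lookup as above, with the property filter in a WHERE clause.
// The AS aliases only rename the columns in the result.
MATCH (k)
WHERE k.name = "Alice"
RETURN k.name AS name, k.email AS email
```

This reads a little more like a SQL SELECT with a WHERE, which may feel more familiar coming from a relational background.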

Updating and Deleting a Node

What if we have a node in which we just want to update a value, or to which we want to add another key-value pair? Again, we use MATCH but this time we end with a SET clause.

MATCH ( k{name:"Alice"}) SET k.email='alice@wherever.com'

In this case we updated the email property. If we wanted to add a new property we would just list it; there is no syntactic difference between updating an existing property and adding a new one. In SQL we would first have to alter the table to add the new column and then we could set a value.
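
To make that concrete, here is a minimal sketch of adding a property that did not exist before. The ‘phone’ property and its value are made up for this example; they are not part of the data created earlier.

```cypher
// 'phone' is a hypothetical property; SET creates it because it is not there yet.
MATCH (k {name: "Alice"})
SET k.phone = "555-0100"
RETURN k
```

Swapping the SET clause for ‘REMOVE k.phone’ should drop the property again.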

Deletion is very similar to updating — we just specify to delete the node at the end instead of returning it.

MATCH ( k{name:"Alice"}) DELETE k

 A point on deleting nodes — Neo4j will not let you delete a node if it still has edges connected to it. You first have to delete the edges and then the node. But as we’ll see you can do that in one statement.
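
One way to do that in a single statement, sketched under the assumption that we want to remove Alice together with whatever relationships she has: OPTIONAL MATCH picks up any connected edges (in either direction) and DELETE removes them along with the node. Neo4j 2.3 and later also offer DETACH DELETE as a shorthand; it is shown here only as a comment.

```cypher
// Collect Alice's relationships, if any, and delete them together with the node.
MATCH (k {name: "Alice"})
OPTIONAL MATCH (k)-[r]-()
DELETE r, k

// Shorthand on Neo4j 2.3 or later (run on its own instead of the statement above):
// MATCH (k {name: "Alice"}) DETACH DELETE k
```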

Creating Relationships

So, this is a graph database. A graph database with only nodes is kinda dull and uninteresting really. So how do we create connections?

First let’s add a few more nodes to our system for demonstration purposes. We’ll also re-create ‘Alice’ since we deleted that node above. And, while we’re at it, we’ll also do this in one statement to show how to add multiple nodes at a time.

CREATE ( p0: Person { name: "Alice",
                      email: "alice@wherever.com"
                    } ),
       ( p1: Person { name: "Ezekial",
                      email: "zeke@nowhere.com"
                    } ),
       ( p2: Person { name: "Daniel",
                      email: "dan@nowhere.com"
                    } ),
       ( p3: Person { name: "Bob",
                      email: "bob@nowhere.com"
                    } )

A couple of things to note here: we added a variable before each person we added (p0, p1, p2, p3). The variables distinguish the nodes within the statement, and as we create more involved queries the utility of that will become more evident. They are optional as long as the label keeps its leading colon, as in ‘(:Person { name: "Bob" })’. If you drop both the variable and the colon, Cypher reads ‘Person’ itself as a variable and Neo4j will complain that ‘Person’ was already declared.

Now let’s find ‘Alice’ and create a relationship to Bob.

MATCH ( p1 {name:"Alice"}), ( p2 {name:"Bob"}) CREATE (p1)-[r:IS_FRIENDS_WITH]->(p2)

So… this takes a little deconstruction. We started with our MATCH statement but instead of just retrieving one node we retrieved two. This is where those variables, p1 and p2, come into play. You can think of them as being kinda/sorta like aliases in SQL.

Once we find the two nodes we can create the link between them. Edges are always directed in Neo4j, and the edge is represented with the ‘start’ node followed by the relationship type and then the second node. The usual way of describing this is to think of an arrow connecting the two, as you might write it in ascii-art: ‘(first node)-[r:RELATIONSHIP_TYPE]->(second node)’. That ‘r’ is an arbitrary variable name and can be left out, but the colon and the type after it are required; without them Neo4j will give you an error, ‘A single relationship type must be specified for CREATE’.
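
Edges can carry properties just like nodes. As a minimal sketch, the statement below attaches one to a new relationship. The ‘since’ property and its value are invented for illustration, and the choice of Ezekial and Daniel as the two endpoints is arbitrary.

```cypher
// 'since' is a hypothetical property stored on the edge itself.
MATCH (a {name: "Ezekial"}), (b {name: "Daniel"})
CREATE (a)-[r:IS_FRIENDS_WITH {since: 2014}]->(b)
```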

 Searching on Relationships

At this point we have something that is becoming a more meaningful graph, albeit a small one. We have a few nodes and a relationship between a couple of them.

MATCH (p1 {name:"Alice"})-[r:IS_FRIENDS_WITH]->(p2) RETURN p2

Again, we use MATCH like we would use SELECT in a relational database. In this case we specify the relationship with that ‘arrow’-like syntax. You’ll notice that we specified p1 to be ‘Alice’ by specifying the attribute, but we didn’t do so for p2 — p2 is what we want to find in this query. When you run this you should see just one node returned, ‘Bob’.
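
A couple of variations on this query, sketched here as one statement: leaving off the arrowhead matches the relationship in either direction, and returning ‘p2.name’ gives back just the name rather than the whole node.

```cypher
// No arrowhead: match IS_FRIENDS_WITH regardless of which way it points.
MATCH (p1 {name: "Alice"})-[r:IS_FRIENDS_WITH]-(p2)
RETURN p2.name
```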

Part 1 Summary

At this point we have covered the very basic operations involved in creating, updating, and deleting nodes, and we started in on how to create and query on edges. In the next post on this topic we’ll continue the discussion on setting up edges and more involved queries.

Getting started with Bukkit plugin development

This post pertains specifically to Bukkit, which is no longer available due to a DMCA takedown request, and the future of Bukkit is somewhat in question now unfortunately.

Getting started developing in any platform always has a learning curve as you find your way around the language(s), tools and conventions involved. Developing plugins for Bukkit is no different. There are a number of good tutorial resources on this topic, so rather than do a full write-up we’ll link to a couple of tutorials and then add some additional comments to outline the process and identify some of the things to watch for — the ‘gotchas’ of plugin development.

The general process for getting started can be enumerated in these steps:

Install the following tools to set up your development environment.

  • Download Oracle’s Java SE JDK (the JRE alone is not sufficient). Go ahead and install that first since you will need this to run Eclipse.
  • Download and install Eclipse (if you are already set up to do Java development with another IDE, such as NetBeans, that’s great. In this post we’ll focus on Eclipse though.)
  • Get the Bukkit jar file. (craftbukkit-1.7.10-R0.1-20140804.183445-7.jar) [Which you can’t get anymore]

You will find several tutorials (both documents and videos) with a bit of searching online. But two pretty reasonable tutorials to get you started are here:

Bukkit Coding for Dummies: a 10 page Google Doc covering plugin development using Eclipse. This has some useful screen shots and code examples.

How to make a Bukkit plugin: This site has several similar examples, also with screen shots and code examples. To pick one tutorial, take a look at Part 10 on invisibility and custom books.

These tutorials will give you the details. For the sake of having a more general overview though, the steps for creating a plugin are basically these:

1. Launch Eclipse and select your workspace: some directory somewhere on your computer that you can let Eclipse more or less have free rein in. Eclipse will use that directory

2. Create a new Java Project.

3. Add a new Package. This is basically the namespace (and folder) that will hold your Java source files. The convention in the world of Java is to name packages using a reversed domain name, so you might use com.myfullname.bukkitone. (And no, you don’t need to really have the name registered or anything like that… although it is a nice touch if you do.)

4. Write up your plugin (see the tutorials above for more details there.) Be sure to import the appropriate libraries as needed.

5. Create the YML file for your plugin (Bukkit expects it to be named plugin.yml). This is where you will list the commands your plugin recognizes.

6. Compile your code into a JAR file. Select your project in the Package Explorer and then Select Export -> JAR File. Uncheck .classpath and .project from the resources to export, and select the location for Eclipse to save your jar file. You can click through to the next couple of screens but the defaults after this should be fine. You can also just hit ‘Finish’.

7. Copy your jar file to your Minecraft server’s plugin directory and start up Minecraft. Minecraft should recognize the plugin and load it. You should see this if you tail the logs/latest.log file.

Things that may trip you up….

Q1) I can’t compile my plugin but my code looks right? Except that Eclipse doesn’t recognize JavaPlugin and a bunch of other names?
A1) You probably don’t have the craftbukkit jar file loaded, or it is somewhere that Eclipse can’t find it (this may be the case if you cloned a plugin from a Git repo and the Eclipse project is referencing a jar file in a path that doesn’t exist on your computer.) You can correct that by simply adding the jar file (which you should have downloaded by now.)
  1. Go to your project’s properties and go to the Java Build Path section.
  2. Click on the Libraries tab.
  3. Click on the Add External JARs button and find the craftbukkit jar file on your computer.
Q2) Why is my plugin not running on my server? It runs okay on my laptop?
A2) This is generally due to a mismatch between the version of Java that your plugin was compiled for and the version you are running on your server. Particularly if you see a message in your error log that looks something like:
Could not load ‘plugins/MyPluggin.jar’ in folder ‘plugins’
org.bukkit.plugin.InvalidPluginException java.lang.UnsupportedClassVersionError de/nofear13/craftbukkituptodate/CraftBukkitUpToDate : Unsupported major.minor version 51.0
Class file version 51.0 corresponds to Java 7, so this particular message means a plugin compiled for Java 7 was loaded on an older Java runtime. In the same way, you can’t load a plugin compiled with Java 8 on a server where you are running Bukkit with Java 7. So what are you supposed to do if you can’t control which version of Java is running on the server?
The short answer is that you have to compile your plugin to target the lowest Java version it may have to run on; Java 6 is a safe choice. In Eclipse you should do the following:
  1. Go to your project’s properties and go to the Java Compiler section.
  2. Uncheck ‘Use compliance from execution environment…’
  3. Set Compiler Compliance Level to 1.6
After that, re-build your jar file and try loading it again.
Q3) In my java code I added a command, but I can’t invoke the command?
A3) Besides the Java source you need to list commands in the plugin’s yml file (a lot of examples don’t note this at all)

A peek into the world of game developers

It can be interesting to see what the folks involved in another industry are discussing. This week the Game Developers Conference has been going on in San Francisco, and while I didn’t attend GDC (seeing as I don’t even work in the games industry) I did head up to check out a couple of events taking place around GDC.

The first was an unconference taking place at the Yerba Buena Center for the Arts called Lost Levels. This was the first unconference I’ve really attended and the format was pretty interesting. The venue was the lawn area at YBCA on which three large tarps were placed to designate three ‘session areas’ (my term, not theirs, as far as I know). People signed up on a board to hold forth on some topic for around ten minutes. The organizers of the event would time speakers and call up the next speaker in turn while people gathered around one session area or another to listen in. By and large this was a simple system that seemed to work quite well: people could have their say on a topic and if others wanted to talk with them more they could follow up afterwards.

So, what can I say about the actual talks given at Lost Levels? There was a wide variety of talks and there was no way that I was going to be able to hear all or even most of them. I definitely do not want to give the impression that I can say what the whole unconference was like; I can only speak to the talks that piqued my interest.

  • At least a couple of speakers discussed interactive fiction (or, perhaps slightly more inclusive, interactive narrative). The impression I got was that this was being ignored by most of the game development community — a point that I’m not informed enough to comment on.
  • Another interesting talk challenged the limitations of character identities in games. Often a player is encouraged to create and identify with a character in a game, but at the same time there are often very few ways that players can really modify their character. The example brought up was gender, which is often simply a male or female option. By now we should know that is pretty limiting for a lot of folks. The speaker challenged game developers to build in more flexibility for players.
  • Another theme that came up in at least a couple of talks was the idea of non-competitive or non-confrontational games. Probably for most gamers the whole idea of a game is defined in some way by competition and conflict. There are games that do not necessarily have this element, and the speakers were encouraging gamers to pay more attention to these and game creators to think about creating more non-combative games.
  • Perhaps the most interesting talk I heard was on how the gaming industry operates. This is a demanding industry that is notorious for burning out game developers. Driving that is a rather ruthless economics where a game is considered a loss if it hasn’t reached a certain level of sales. Some of the things the speaker was referencing flew by me but no doubt they would have been understood by others in the audience.

A number of the participants were clearly involved in the games industry making games at some level, either as game designers, artists, programmers, art directors, project managers, in small startups or larger, more established companies.

The second event was a smaller session, The Future of Games and Entertainment, at Swissnex San Francisco. This was a good event to meet others and do a bit of networking. Chuck Eyler gave a talk drawing on his own career in the film and gaming industries. On display were several interesting games that visitors could play.

While there were more games played on PCs or iPads, the main thing I took away from that event was the games that utilized different ways for players to interact with them. One game used small boats on a shallow pool of water, on which was projected a series of dots surrounding the boats. Each player fired the boat’s gun by blowing into a small round metal tube on their end of the game board. Another interesting game (although I’m not sure I’d call it a game exactly) was a picture where the subject (in this case a small child) ‘wakes up’ when a viewer approaches it and starts to mimic the facial expressions of the viewer in a manner that was both interesting and disturbing at the same time. I’m not sure if this qualifies as an ‘uncanny valley’ type of experience, but it reminded me of that.

Games have by now become a huge thing — not just as an industry but in terms of our collective culture. I think it’s safe to say that gaming produces new sub-cultures. This has been building for a long while. There are clearly a lot of people thinking about the issues that are arising in gaming. There is a good amount of self-reflection starting to take place. Yet, I can’t help but get the impression from my day attending these events that there is a profound lack of theory here. Really I should say that there is a lack of familiarity with existing theories from the domains of cultural anthropology, economics, and so forth, as well as a lack of theories about the nature of gaming itself. Perhaps it is that it is still too early for that to have happened. Admittedly, it could be happening but it wasn’t evidenced by my day being a tourist on the fringes of the world of gaming.

Which programming language/tool/other technology to learn?

When I was an undergrad studying computer science I remember walking with a couple of friends and fellow students through one of the buildings on campus one night. As it happened it was the building where the computer science department was located. We were (probably) heading to one of the computer labs, and as it happened the chair of the department had come out of the elevator and was on his way out of the building. We said hello and he then decided to take the moment to share a bit of advice, which was to learn Java. That was it, fairly simple. I think at the time we looked at each other and more or less shrugged our shoulders and got on with things.

Somehow I’ve remembered that bit of advice from our department chair since then (although certainly I didn’t act on it at the time in a meaningful way). That was about twenty years ago (perhaps 1993 or 1994) and at the time Java was still a pretty new language, and none of our regular computer science courses was using the language. With a few exceptions most of the programming I was doing in my classes as an undergrad was in C. My introductory programming class in college used a language called Modula-2, which probably very few remember now (and I believe that was the last semester they taught that class with Modula-2, switching to C afterwards). There was also a course on programming language design that introduced us to Scheme (a close cousin of LISP). The point though was that the department chair recognized that Java was going to be an important language and was worth investing the time to learn.

When getting started in this business there is an incentive to try to learn a lot of different languages, or development tools. After all, the more one knows, the more potentially valuable one is when hitting the job market. Or at least so we might be inclined to think, a point I’ll return to later. But there are any number of things one might spend time learning and only a subset of those will prove to actually be useful. This is a classic problem of course: how do you decide what to focus on and what to set aside? A couple of years ago when I was teaching I had a student more or less ask this question as well. I think since I was an undergrad the problem has become much more challenging. There are simply more languages and applications in active use now. We have more choices for databases, and mobile app development didn’t even exist ten years ago.

Trying to decompose this problem is worth doing I think. For any given language, tool, or other bit of technology we might pose a few questions:

How widely used is it now?

A widely used technology can be comforting to get into since you may assume there will be more resources to help get up to speed with it, as well as a broader community to tap into. If we are talking about programming languages, then you might look to see which ones there seems to be more demand for. The Jobs Tractor Language Trends – February 2013 report for instance shows Java and PHP being more popular now. But a narrow market segment isn’t necessarily bad either. In some niche areas a less widely used language or technology might be the dominant one in that area. Also, new technologies have to start out somewhere and can sometimes take a while to find their audience, which leads to the second point.

Potential for growth?

Besides current popularity another variable is what kind of growth might be expected for people that know Technology X.

Don’t confuse how widely used something is with demand — they are related but not the same thing. There’s still a market for COBOL and Fortran developers, for instance.

Open vs. Proprietary

Right now there is a division between open technologies and closed, proprietary ones. With proprietary technologies you can expect your investment cost to go up if only because of the need to acquire the necessary software (and license).

Getting into mobile app development is a good case study here. Becoming an iPhone developer, for example, requires a certain up-front investment: you need a) a Mac to have Xcode and the whole development environment, b) an iPhone and c) a membership in Apple’s iPhone developer program. However, that’s a popular platform to develop for (at the risk of understatement) so yeah, it’s pretty compelling. It should be said, given this example, that becoming an Android developer isn’t free either, but you can get by with a less expensive computer with Linux and you can get Java for free.

Is it something you are interested in?

I think there’s something to be said for pursuing things you actually are interested in versus things that you think will be good to know, but are otherwise not that into. You’ll generally do better at things that you are intrinsically inclined towards. At the end of the day I think this is the one to weigh most.

Avoid spreading yourself too thin

Finally, to get back to the point about trying to ‘learn everything’, at a certain point I think it is important to recognize that you can’t possibly do that, and certainly not all at the same time. There is likely an added bonus that comes from having experience with a variety of tools (seeing how different languages handle the same or similar problems can be insightful, for instance) but that kind of knowledge comes over time. Over the span of a career your primary tools are going to evolve anyway.

The list above is just an attempt to try to sketch out how to think about this problem in some systematic way; I certainly don’t think I have any definitive answers here. What I can say is that, at least for myself, I’ve decided that there are certain areas I don’t see myself investing time in, so that I can focus on others. I’m less inclined to get into .Net and Windows development at this point in my career, for instance, since that would be a pretty significant switch from where my current skill set is, and quite frankly I’m not as interested in it. This is certainly not to disparage .Net and Windows development; it’s just a choice of where to spend resources.

W. Edwards Deming’s view of systems

I’ve had a general interest in systems and systems theory and how different people have written about it. Depending on the context, different authors will view what a system is differently. Recently I’ve been reading W. Edwards Deming’s The New Economics for Industry, Government, Education. Deming, who passed away in 1993, had a long career as a management consultant. He is probably best known for his work in Japan introducing concepts of quality control there. The Deming Prize was created in his honor and continues to be awarded every year. (See here for more information on the Deming prize than is available on the JUSE site.)

Deming’s concept of a system is a key part of his overall thinking and is the basis for what he termed the ‘system of profound knowledge’. His view of systems differs a bit from how systems are described elsewhere. To start, Deming is primarily concerned with man-made systems, though he does not go on to elaborate much on how a man-made system differs from naturally occurring systems. His basic point that any system consists of several, interdependent components is not, in itself, a radical departure from other conceptualizations of systems. A component might be a person, and it does seem that often Deming uses the term ‘component’ interchangeably to refer to people in a system. There’s no reason to think that components can’t include other items besides people though — equipment of various kinds, raw materials, etc.

Where he does differ is in other statements about systems and their dynamics. To understand how Deming arrives at his conceptualization of a system it is worth focusing for a moment on his idea of the aim of a system. Any system must, according to Deming, have an aim, and if it doesn’t then by definition it isn’t a system. All the components in the system need to be focused on the aim and pursue that to the best of their ability. He further recommends that the aim of a system should be that all the people involved in the system benefit over the long term from their participation in it.

What is perhaps an unexpected point is that he argues that competition is ultimately a destructive force, and a system must be actively managed to prevent that. If left untended, the components of a system will naturally turn towards pursuing their own individually-based aims at the expense of the overall aim of the system. This, then, was the role of management: to regulate the components and keep them focused on the aim. What is interesting here is that there are two ways to read this. One is that Deming didn’t think that there could be any self-regulation in a system, at least among the totality of the components. The other interpretation is that the management in a system constitutes the self-regulatory mechanism. That, however, seems to be a different kind of self-regulation than the emergent phenomenon that arises from the interaction of all the components in a system.

Deming had nothing good to say about competition. Deming explicitly states that competition is ultimately a destructive force, and instead we should strive towards cooperation. At the same time he argues that a system includes competitors, so it does seem that he acknowledges that competition exists. There is perhaps a bit of ambiguity here, but it seems that some form of competition could be beneficial; he describes a positive form of competition when competitors are seeking to expand a market or to provide better service in some way. It does seem that Deming has two views of competitive practice. The first, ‘win-lose’ form is ultimately destructive. When companies seek to increase their share of a market or to push other competing companies out, that is ultimately unhealthy. The second, ‘win-win’ version would look a bit different: instead of looking to take over more of a market, all the players in some space look to expand that market so that all come out ahead. Such a ‘competitor’ is a kind of cooperative competitor ultimately.

The description of systems in The New Economics is not as fully fleshed out as one might hope. There are a number of statements that Deming makes that are simply asserted without any further discussion. The text often refers to another book Deming wrote some years prior, Out of the Crisis, and it may be that Deming thought of these books as being companion pieces. Suffice it to say in concluding that Deming, given his profession and most of his experience, was primarily interested in systems as they are found in organizations we create. This is clear from the full title of the book, referring to industry, education, and government. While Deming no doubt had a broad view of how systems thinking could be applied, it probably did have limits.

The New Economics for Industry, Government, Education
Second Edition
Deming, W. Edwards

© 1994 The W. Edwards Deming Institute

ISBN-13 : 978-0-262-54116-9

An initial post…

Somehow when it comes to my own site it’s taken me forever to get something up. At least something more than a totally blank page with my LinkedIn and Twitter links; I never seem to be quite happy with any designs I come up with….

So, I’m taking a new tack now. Knowing I probably won’t really be happy with the look of the thing for a while, I’m trying a basic WordPress install with the thought that I’ll just evolve it over time as I use it. We’ll see how that goes.