Wednesday, 16 May 2012

From naive physics to naive computation



Having done another couple of demos to interested observers, I've realised that the "translate" operation is not that useful. Translation is currently defined in terms of a vector, but translation is most often used to move something to the place you want it - the vector required to get it there is a byproduct, not the main point of interest. In fact, getting an object to a specified place is a real pain at present. The steps are:

  • Create a point representation layer. The default behaviour, when a point representation is created over an image, is to use the centroid of the image as its value, so this works OK.
  • Create another point representation. The default behaviour, when another point is under this one, is to create a new randomised value. This is also OK, because an arbitrary translation can be used as the basis from which the user explores alternative values.
  • Create a vector calculator. This will make a vector derived from the two points, which is also OK.
  • Create a translation layer, which will use the original image, and the calculated vector, to move the image to the required point.

So four layers are required to achieve one effect - moving an image to a chosen point - which is arguably the most natural way to specify a translation anyway. Today I created a simpler variant of translation, which simply takes a point as a parameter and moves the image there.
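
To make the contrast concrete, here is a minimal sketch of the two styles, on the assumption that each of these layers ultimately produces a Java AffineTransform; the class and method names are my own for illustration, not the ones actually in the system.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

// Illustrative only: the layer and method names are invented for this sketch.
class TranslateLayer {
    // The current "proper" version: the displacement vector is the parameter,
    // so the user must first derive it from two point representation layers.
    static AffineTransform fromVector(double dx, double dy) {
        return AffineTransform.getTranslateInstance(dx, dy);
    }
}

class MoveLayer {
    // The simpler variant: the destination point is the parameter, and the
    // vector from the image centroid to that point is worked out internally.
    static AffineTransform toPoint(Point2D centroid, Point2D destination) {
        return AffineTransform.getTranslateInstance(
                destination.getX() - centroid.getX(),
                destination.getY() - centroid.getY());
    }
}
```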

However, this does seem to point to a more general issue. My geometric operators are all nicely based on properly defined mathematical transformations. As a side-effect, this made them really easy to implement - each one simply corresponds to one of the Java Graphics2D AffineTransform operations. But perhaps we should be suspicious when the abstraction needed to implement a system function is too convenient - it's a sign that we might be imposing the programmer's mental model on the user.
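
As a rough illustration of that convenience (the factory calls are the standard Java 2D ones, but the surrounding method and parameter names are mine):

```java
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.image.BufferedImage;

// Each geometric layer corresponds one-to-one to an AffineTransform factory.
class GeometricLayers {
    static void render(Graphics2D g, BufferedImage image,
                       double dx, double dy, double theta, double sx, double sy) {
        AffineTransform translate = AffineTransform.getTranslateInstance(dx, dy);
        AffineTransform rotate = AffineTransform.getRotateInstance(
                theta, image.getWidth() / 2.0, image.getHeight() / 2.0);
        AffineTransform scale = AffineTransform.getScaleInstance(sx, sy);

        // A layer simply hands its transform to Graphics2D when drawing; the
        // three are drawn separately here only to show the mapping.
        g.drawImage(image, translate, null);
        g.drawImage(image, rotate, null);
        g.drawImage(image, scale, null);
    }
}
```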

In fact, every one of the geometric transforms has turned out to be only loosely related to the user applications that I've found interesting for it. In a "naive geometry" approach, they could be described as follows:

  • Translate -> "move it to here"
  • Rotate -> "make it spin round" (usually as an animation that doesn't stop)
  • Scale -> "stretch or squash" (not uniformly, but in various ways that drag handles can produce)

I suspect that I should discard the proper mathematical versions, replacing them with a move layer (which is what I had originally, before making the more elegant mathematical transforms), a spin layer, and a layer that can reproduce any number of adjustments made with the direct manipulation handles. This last layer will also overcome another problem. Although it was possible to generate a mathematical transform layer initialised according to handle manipulation, only the last manipulation was included - mainly because it would be so surprising to the user to see multiple layers of transform appear in response to a single button press.
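
To show what I mean by the difference, a spin layer might look something like the sketch below - the rotation angle advances on every frame, so it is an animation rather than a single transform (the names and the frame-tick mechanism are my assumptions, not a design decision yet):

```java
import java.awt.geom.AffineTransform;

// A possible "spin" layer: unlike a one-off mathematical rotation, the angle
// is advanced on every animation tick, so the image keeps spinning.
class SpinLayer {
    private double angle = 0.0;            // current rotation, in radians
    private final double radiansPerFrame;  // spin speed
    private final double cx, cy;           // spin about the image centroid

    SpinLayer(double radiansPerFrame, double cx, double cy) {
        this.radiansPerFrame = radiansPerFrame;
        this.cx = cx;
        this.cy = cy;
    }

    // Called once per animation frame.
    AffineTransform nextFrame() {
        angle += radiansPerFrame;
        return AffineTransform.getRotateInstance(angle, cx, cy);
    }
}
```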

Finally, a little reflection on "naive geometry". This takes me back to my MSc thesis, when I formulated a naive physics-style "qualitative trigonometry" that could be used for robust spatial reasoning by robots. Along with the rest of the naive physics / qualitative spatial reasoning movement in the mid-1980s, this could well have represented a user orientation within the AI community, as we tried to create knowledge representations that were better aligned with common sense. At that time, the motivation was to replicate human problem-solving performance rather than to make computer "reasoning" easier for humans to understand, but the latter was undoubtedly a side effect.

From a more HCI-oriented perspective, the Natural Programming project of Brad Myers and his students could be considered an approach to defining a "naive computation", in which program behaviour is described in common-sense terms. I should probably have spotted this earlier, because I've been recommending it to Jamie Diprose, a student of Beryl's who is creating a visual programming language for healthcare robotics. I've encouraged Jamie to take an approach derived from John Pane's natural programming work, interviewing healthcare professionals to establish a vocabulary of domain concepts for use in his language. The healthcare robotics domain is sufficiently unlike the general-purpose mechanical assembly domain of my own earlier work that I hadn't noticed the analogy to qualitative trigonometry, but now that I've noticed it, I could regard my project as creating a domain-specific language for exploratory image manipulation.
