Kinetic Interfaces – Final Project (Francie)

In my final project I continue exploring the tile game for which I developed a basic prototype during the midterm (link to my midterm project documentation). Basically, the real-time image captured by the webcam is divided evenly into 16 tiles, with one of them removed to leave an empty space. Players can move the tiles next to the empty space into it, exchanging their positions. The goal of the game is to restore the disordered tiles into the original image.

In my final project, I made major improvements to the interaction of the game. After learning about the Kinect devices, I replaced the keyboard operations with body movements. Prof. Moon helped me create the swipe gesture based on the existing library. Basically, the Kinect has a depth sensor and is good at detecting the point closest to it. When we do swipe gestures, we naturally reach our hands and arms out in front of our body, which makes the hand the closest point to the Kinect device. As a result, the movement of the closest point is in sync with the swiping direction of the hand, and it can easily be tracked through changes in its coordinates: the difference between the current frame and the previous frame implies where the closest point is moving, which in other words represents the movement of our swipe gesture.

When we were testing the swipe gesture, I raised the concern of how to repeat a hand movement in the same direction more than once. To be more specific, if I want to move the tiles upward continuously, I will swipe my hand up, take it back, and swipe up again. However, the sensitive Kinect will record the tiny “take it back” motion and analyze it as a “swipe down” based on the coordinate values. Therefore, I needed to set up a break between movements and let the computer know which one should count. Prof. Moon quickly solved this problem by adding an interval between movements. The interval works like a timer: it guarantees that the next movement will not be triggered until the timer counts down from its maximum to 0.
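The interval idea can be sketched as a small cooldown counter (written here as plain Java rather than the project's actual Processing code; the class name, movement threshold, and cooldown length are all illustrative):

```java
// Sketch of the swipe "interval": after a swipe fires, a cooldown counter
// must run down to 0 before the next swipe can be recognized, so the small
// "take it back" motion right after a swipe is ignored.
public class SwipeDetector {
    static final float THRESHOLD = 30;  // minimum frame-to-frame movement to count as a swipe
    static final int COOLDOWN_MAX = 15; // frames to wait before accepting another swipe

    int cooldown = 0;

    // dy = closest point's y in the current frame minus its y in the previous frame.
    // Returns "UP", "DOWN", or null if no swipe is registered this frame.
    public String update(float dy) {
        if (cooldown > 0) {
            cooldown--;      // still counting down: ignore all movement
            return null;
        }
        if (dy < -THRESHOLD) { cooldown = COOLDOWN_MAX; return "UP"; }
        if (dy >  THRESHOLD) { cooldown = COOLDOWN_MAX; return "DOWN"; }
        return null;
    }

    public static void main(String[] args) {
        SwipeDetector d = new SwipeDetector();
        System.out.println(d.update(-40)); // big upward motion: registers as UP
        System.out.println(d.update(35));  // hand pulled back: ignored during cooldown
    }
}
```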

As for the randomization, I searched online and found an algorithm called the Fisher-Yates shuffle. Here is the sample code. I did not understand at first how it implemented the randomization. My friend Joe Shen helped me apply the structure to my tiles and explained the basics to me. The two parts below designate the targets for the shuffle.
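For reference, a generic Fisher-Yates shuffle over the 16 tile indices looks roughly like this (a textbook sketch, not Joe's actual adaptation):

```java
import java.util.Arrays;
import java.util.Random;

// Minimal Fisher-Yates shuffle over the 16 tile indices (0..15):
// walk the array from the end, swapping each slot with a randomly
// chosen slot at or before it. Every permutation is equally likely.
public class TileShuffle {
    public static void shuffle(int[] tiles, Random rng) {
        for (int i = tiles.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1); // random index in [0, i]
            int tmp = tiles[i];
            tiles[i] = tiles[j];
            tiles[j] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] tiles = new int[16];
        for (int i = 0; i < 16; i++) tiles[i] = i;
        shuffle(tiles, new Random());
        System.out.println(Arrays.toString(tiles)); // a random permutation of 0..15
    }
}
```

One caveat worth knowing for a sliding-tile puzzle: only half of all random permutations are actually solvable, so a polished version would either check the permutation's parity or shuffle by simulating legal moves.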

In order to start the shuffle, I also needed to put the following code after the conditions. At first the shuffle was generated at the beginning of the game, but later I decided to add more interaction and tried to shuffle the tiles when the player stepped within a certain depth range. However, it was difficult to control the tracking of the closest point at that distance, so I attached the shuffle function to the swipe gesture instead. As long as the player keeps swiping upward several times in a row, the tiles shuffle themselves randomly.

Here is the demo video that I took during the semester show.

My project still has a lot of potential for development. As the guests mentioned in the final presentation, I could consider a better user interface beyond the pure real-time image. It would be better if I could animate the movements of the tiles with the lerp() function. Besides, some of my classmates pointed out an accuracy problem after they played the game, which requires attention and major improvements. With regard to the shuffle, I am thinking about a more logical movement to trigger it, so that it fits better with our pre-existing habits and knowledge.
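The lerp() suggestion would work like this: each frame, move the tile's drawn position a fraction of the way toward its target slot instead of snapping it there. A standalone sketch of that easing (lerp() below mirrors Processing's built-in formula; the 0.2 easing factor is just an example):

```java
public class TileEase {
    // Same formula as Processing's lerp(): a fraction amt of the way from start to stop.
    public static float lerp(float start, float stop, float amt) {
        return start + (stop - start) * amt;
    }

    public static void main(String[] args) {
        float x = 0, targetX = 100;
        // Each "frame", move 20% of the remaining distance toward the target.
        for (int frame = 0; frame < 5; frame++) {
            x = lerp(x, targetX, 0.2f);
            System.out.println("frame " + frame + ": x = " + x);
        }
        // x approaches 100 smoothly instead of jumping there in one step.
    }
}
```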

Olesia and Oriana, Final Project Documentation. “What if I get sucked up into space”

Link to the proposal:

For our final, my initial idea was to continue working on the midterm project that Ori and I developed. At first I thought I would work alone, but then Ori decided to switch from her project and join me. This is how we ended up working together again.


As stated in our slides, the idea is to simulate an experience in which the user gets sucked up into space. I was inspired by a famous scene from a movie.

Here is the moment that triggered my thinking:

I am not going to go into details regarding the ideation process and the details of how we planned on doing it as we talked extensively about it in the presentation as well as in person with the professor.


Oriana and I tried to break our work into three main components:

  1. aesthetics
  2. interaction with the Kinect
  3. adjusting the details

The aesthetics part was somewhat challenging, as we tried to avoid the goofy, cartoony effect that our midterm sketch had. So, we started out by working on the visuals. First, we created a new class for the smallest stars you can see. Then, we set out to adjust the speed of the movement and to multiply and randomize the number of stars and planets coming at the user. Later on, we added a sound I found on the NASA website, which is essentially a bunch of radio waves converted by NASA into a format audible to humans. The sound is mainly a set of vibrations that, depending on the quality of the speakers or headphones, gets trippier and trippier. We realized that we probably should have used speakers, but it was a bit too late to change the setup of the project at that point. Hence, for the final presentation, Ori and I used my headphones.

We used the podcast room to showcase our work. First, we put the bear on the floor to make the user more comfortable laying down. Then, we set up the Kinect and the projector behind the user on a bunch of chairs. This is how the user would experience the project: fully horizontal, in the darkness, wearing headphones.


As for the IMA show and the final presentation, we got extremely good feedback from the users. They liked the idea and the implementation. The best part, they said, was getting to control the amplitude of the sound with their hands. They also noted how trippy and immersive it was. Here are a bunch of videos of people experiencing our project:

IMG_8351 IMG_8353



We would really love to use as many projectors as possible to get the most immersive effect. We already tried using two projectors, but due to the limited number of good projectors available at the school, we couldn’t reserve the right one. So, we went with just one projector. For the future, we were thinking of a) using more projectors and Kinects, b) improving the aesthetics a bit, and c) replacing the headphones with high-quality bass speakers, so that the minute users get into the room, they start tripping.


On our way to “success”, Ori and I didn’t encounter any major hindrances; however, we did experience some setbacks. First of all, we had some problems with our classes: they would not work properly with the main sketch. Second of all, our framerate was so slow at first that we could not even test whether or not our code was working at all. Finally, projection mapping did not always go the way it was supposed to. It took me a significant amount of time to figure out exactly how it worked so that we could successfully utilize it in our presentations.


Overall, we are very happy with the final result. Not only did we receive positive feedback from the faculty and IMA students but also from the “outside” people. We are really excited to have our project exhibited at the Century Mall, if possible. That would be amazing!


import processing.sound.*;
import codeanticode.syphon.*;
import org.openkinect.processing.*;

SoundFile file;
SyphonServer server;
Kinect2 kinect2;

float minThreshHead = 480;
float maxThreshHead = 1500;
PImage img;
float avgX, avgY;

ArrayList<Star> stars = new ArrayList<Star>();
ArrayList<Planet> planets = new ArrayList<Planet>();
ArrayList<Little> littles = new ArrayList<Little>();
PImage pimg;

void setup() {
  size(1200, 900, P3D); // 16:9 4:3
  background(0);

  file = new SoundFile(this, "SPACE_S.wav");
  file.loop(); // play the space soundtrack continuously

  server = new SyphonServer(this, "Processing Syphon");

  kinect2 = new Kinect2(this);
  kinect2.initDepth();
  kinect2.initDevice();

  img = createImage(kinect2.depthWidth, kinect2.depthHeight, RGB);
  pimg = loadImage("data/planet.jpg");

  for (int i = 0; i < 100; i++) {
    stars.add(new Star(random(-width/2, width/2),
      random(-height/2, height/2),
      random(-200, 200)));
  }
  for (int i = 0; i < 20; i++) {
    planets.add(new Planet(random(-150, 150), random(-100, 100), random(-1500, 500), pimg));
  }
  for (int i = 0; i < 800; i++) {
    // note: random(low, high) needs low < high
    littles.add(new Little(random(-width/2, width/2), random(-height/2, height/2), random(-1000, -500)));
  }
}

void draw() {
  // background: a translucent rectangle leaves motion trails
  fill(0, 30);
  rect(0, 0, width, height);

  // 3D Space
  translate(width/2, height/2);

  int[] depth = kinect2.getRawDepth();

  float sumX = 0;
  float sumY = 0;
  float totalPixels = 0;

  int h = kinect2.depthHeight;
  int w = kinect2.depthWidth;

  img.loadPixels();
  for (int y = 0; y < h; y++) {
    for (int x = 0; x < w; x++) {
      int index = x + y * w;
      int d = depth[index];

      if (d > minThreshHead && d < maxThreshHead && x > 100) {
        sumX += x;
        sumY += y;
        totalPixels++; // count the pixels inside the threshold
        img.pixels[index] = color(255, 0, 0);
      } else {
        img.pixels[index] = color(0, 0);
      }
    }
  }
  img.updatePixels();

  if (totalPixels > 0) {
    avgX = sumX / totalPixels;
    avgY = sumY / totalPixels;
  }
  //color cp = get(30, 20);

  float speed = 0.5;
  float accX = map(avgX, 0, img.width, -speed, speed);
  float accY = map(avgY, 0, img.height, -speed, speed);
  //println(accX + " " + accY);

  for (int i = 0; i < planets.size(); i++) {
    Planet p = planets.get(i);
    //p.updateVelocity(accX, accY);
    p.move();
    p.restart();
    p.display(); // for ArrayList: the number of elements is not fixed
  }
  for (int i = 0; i < stars.size(); i++) {
    Star s = stars.get(i);
    s.move();
    s.restart();
    s.displayStar();
  }
  for (int i = 0; i < littles.size(); i++) {
    Little l = littles.get(i);
    l.move();
    l.restart();
    l.displayLittle();
  }

  // 2D Canvas: overlay the depth image and the tracked point for debugging
  image(img, 0, 0);
  //image(pimg, 0, 0);
  stroke(0, 255, 0);
  line(img.width/2, img.height/2, avgX, avgY);
  fill(0, 255, 0);
  ellipse(avgX, avgY, 10, 10);
  text(frameRate, 10, 20);

  server.sendScreen(); // publish the frame over Syphon for projection mapping
}


class Planet {
  PImage pimg;
  float x, y, z;
  float rad1, rad2, rad3;
  float velX, velY, velZ;

  // constructor
  Planet(float _x, float _y, float _z, PImage _pimg) {
    x = _x;
    y = _y;
    z = _z;
    pimg = _pimg;
    rad1 = random(25/1.5, 35/1.5);
    rad2 = random(50/1.5, 60/1.5);
    rad3 = random(20/1.5, 25/1.5);
    velX = 0;
    velY = 0;
    velZ = random(3, 4);
  }

  void display() {
    pushMatrix(); // isolate this planet's transform
    translate(x, y, z);
    //ellipse(0, 0, rad1*2.5, rad1*2.5);
    imageMode(CENTER);
    image(pimg, 0, 0, rad2*2, rad2*2); // draw the planet texture (sizing here is an assumption)
    imageMode(CORNER);
    popMatrix();
  }

  void updateVelocity(float vx, float vy) {
    // pos <- vel <- acc = force
    velX = -vx;
    velY = -vy;
  }

  void move() {
    x += velX;
    y += velY;
    z += velZ;
  }

  void restart() {
    if (z > 500) {
      x = 0;
      y = 0;
      z = -1500;
      println("reset");
    }
  }
}

class Little {
  float x, y, z;
  float velX, velY, velZ;
  float rad;

  // constructor
  Little(float _x, float _y, float _z) {
    x = _x;
    y = _y;
    z = _z;
    rad = random(4, 7);
    velX = 0;
    velY = 0;
    velZ = random(1, 2);
  }

  void displayLittle() {
    pushMatrix(); // isolate this particle's transform
    translate(x, y, z);
    fill(255);
    noStroke();
    sphere(rad * 0.15);
    popMatrix();
  }

  void move() {
    x += velX;
    y += velY;
    z += velZ;
  }

  void restart() {
    if (z > 1000) {
      z = -500;
    }
  }
}

class Star {
  float x, y, z;
  float rad;
  float velX, velY, velZ;

  // constructor
  Star(float _x, float _y, float _z) {
    x = _x;
    y = _y;
    z = _z;
    rad = random(4, 7);
    velX = 0;
    velY = 0;
    velZ = random(3, 4);
  }

  void displayStar() {
    pushMatrix(); // isolate this star's transform
    translate(x, y, z);
    fill(219, 237, 255);
    noStroke();
    sphere(rad * 0.5); // the original drawing call was lost; a small sphere is assumed here
    popMatrix();
  }

  void move() {
    x += velX;
    y += velY;
    z += velZ;
  }

  void restart() {
    if (z > 1000) {
      z = -500;
    }
  }
}

Kinetic Interfaces – Week6 Assignment (Francie)

I created a small game that involves the grab gesture in Leap Motion. Basically, the Leap Motion captures the real-time movement of the user’s hand. When the hand grabs the bouncing ball inside the square, a lot of small balls spread out in all directions.

I first create a bouncing ball that moves inside a limited square. It will change the direction once it bumps into the boundaries.

Then I work on the small balls. I use ArrayList because the number of the balls is uncertain. Based on the previous exercises, those small balls will disappear when they go out of the screen.

In order to add more fun, I make the small balls change color if they go out of the designated boundaries. While they are inside the square, all the balls are filled in black. When they float out of the square, they will randomly change to different colors.

After preparing both the target ball and the small balls, I started using the LeapMotion library to combine them with the grab gesture. The program detects the grab gesture, and at the same time the distance between the middle finger and the target ball must be less than a certain threshold. In this way, the program can more accurately decide whether the user has grabbed the target ball.
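The grab test boils down to two conditions combined; here is a minimal sketch in plain Java (the 0.9 grab-strength cutoff and the distance threshold are illustrative assumptions, not the sketch's real numbers):

```java
public class GrabCheck {
    // The grab counts only when the hand is actually near the ball:
    // the grab gesture must be strong enough AND the finger close enough.
    public static boolean isGrabbed(float grabStrength,
                                    float fingerX, float fingerY,
                                    float ballX, float ballY,
                                    float maxDist) {
        float d = (float) Math.hypot(fingerX - ballX, fingerY - ballY);
        return grabStrength > 0.9f && d < maxDist;
    }

    public static void main(String[] args) {
        // hand fully closed right on top of the ball: grabbed
        System.out.println(isGrabbed(1.0f, 100, 100, 105, 100, 40)); // true
        // hand closed but far from the ball: not grabbed
        System.out.println(isGrabbed(1.0f, 300, 300, 100, 100, 40)); // false
    }
}
```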

Below is the demo of this simple game:

Kinetic Interfaces – Week10 Assignment (Francie)

In this week’s assignment, I practice with ControlP5 GUI library and get a chance to explore various GUI designs. I use five different GUIs to adjust values and thus generate customized effects on the Kinect point cloud.

First of all, I create a toggle to switch between two different modes. Originally, I wanted to control the particles that would shape into the point cloud, say, sphere or cube particles. But I got into a few issues with the points and then I decided to use the toggle to switch between grey mode and color mode.

Under the default grey mode, there is a range slider that allows users to adjust the depth range. The images are shown in grey tones based on their distance towards the Kinect. The left side of the bar determines the minimum depth value, while the right side represents the maximum.
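The mapping from depth to grey tone can be sketched like this (map() below mirrors Processing's built-in formula; the nearer-is-brighter convention and the example range values are assumptions, not the actual sketch's settings):

```java
public class DepthToGrey {
    // Same formula as Processing's map(): rescale value from [inMin, inMax] to [outMin, outMax].
    public static float map(float value, float inMin, float inMax, float outMin, float outMax) {
        return outMin + (outMax - outMin) * (value - inMin) / (inMax - inMin);
    }

    // Convert a raw depth value to a grey tone: nearer means brighter,
    // and anything outside the slider's [minDepth, maxDepth] range is black.
    public static float depthToGrey(int d, int minDepth, int maxDepth) {
        if (d < minDepth || d > maxDepth) return 0;
        return map(d, minDepth, maxDepth, 255, 0);
    }

    public static void main(String[] args) {
        System.out.println(depthToGrey(500, 500, 4500));  // 255.0 (closest in range: white)
        System.out.println(depthToGrey(4500, 500, 4500)); // 0.0 (farthest in range: black)
        System.out.println(depthToGrey(5000, 500, 4500)); // 0.0 (out of range)
    }
}
```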

If you click the toggle and switch to the color mode, the range slider will disappear, and instead a group of RGBA color bars will show up. By sliding on the color bars, users can change the color of the image.

Here is the code that shows how to switch between grey mode and color mode. As you can see, the grey mode draws a point cloud, while the color mode uses an image. They are two quite different methods, and sometimes they run into conflict.

One of the common controls under both modes is a slider that changes the resolution, and it exposes the conflict between the two modes. If I change the resolution under the grey mode, the image freezes and stays on the screen; if I do not touch the resolution, I can smoothly switch to the color mode. Meanwhile, the problem does not happen in the color mode, where I can freely change the resolution of the image.

The final button is used to save screenshots to files. I find the function saveFrame() quite useful and efficient.

Kinetic Interfaces – Intel RealSense for Web w/ Node.JS, Midterm (Kevin Li)

Realsense for Web and p5.js



I wanted to explore Depth Sensing cameras and technology. I knew that regular cameras were able to output RGB images and we can use openCV or machine learning to process the image to learn basic features of a scene. For example, we can do blob or contour detection which allows us to do hand or body tracking. We could also do color tracking, optical flow, background subtraction in openCV. We could apply machine learning models to do face detection, tracking and recognition. Recently, pose or skeletal estimation has also been made possible (posenet, openpose).

However, even with openCV and ML, grasping dimensions (or depth) is a very difficult problem. This is where depth sensors come in.


Pre-Work Resources

I gathered a variety of sources and material to read before I began to do experiments. The links below constitute some of my research and are also some of the more interesting readings I thought were very relevant.

Articles on pose estimation or detection using ML:

On Kinect: (great in-depth on how Kinect works)

On Depth Sensing vs Machine Learning: (great article!)

On Stereo Vision Depth Sensing:

Building and calibrating a stereo camera with OpenCV (<50€)

On Using Intel Realsense API and more CV resources:


Research Summary

I learned that while cutting-edge machine learning models can provide near real-time pose estimation, models are typically trained to do only the thing they are good at, such as detecting a pose. Furthermore, a big problem is energy consumption and the requirement of fast graphics processing units.

Quality depth information, however, is much more raw in nature, and it can make background removal, blob detection, point cloud visualization, scene measurement and reconstruction, as well as many other tasks, easier and more fun.

Depth sensors do this by adding a new channel of information, depth (D), for every pixel; together these values comprise a depth map.

There are many different types of depth sensors, such as structured light, infrared stereo vision, and time-of-flight sensors, and this article gives a really well-written overview of all of them.

All in all, each has specific advantages and drawbacks. Through this article, I knew the Kinect used structured light, and I generally knew how the Kinect worked as well as its depth quality from having done previous experiments with it. I wanted to explore a new, much smaller depth sensor (it runs on USB) that uses a method known as infrared stereo vision (inspired by the human visual system) to derive a depth map. It relies on two cameras and calculates depth by estimating disparities between matching key-points in the left and right images.
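The disparity-to-depth relation is the classic pinhole stereo formula, Z = f * B / d: depth equals focal length times baseline divided by disparity. A quick worked example (the focal length and baseline below are illustrative, not a real camera's calibration):

```java
public class StereoDepth {
    // Pinhole stereo relation: depth Z = (focalLength * baseline) / disparity.
    // Key-points that sit far apart in the left/right images (large disparity)
    // are close to the camera; small disparity means far away.
    public static double depth(double focalLengthPx, double baselineM, double disparityPx) {
        return focalLengthPx * baselineM / disparityPx;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: f = 640 px, baseline = 0.05 m
        System.out.println(depth(640, 0.05, 64)); // 0.5 m away
        System.out.println(depth(640, 0.05, 16)); // 2.0 m away
    }
}
```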

I knew the Realsense library had an open-source SDK; however, it is written in C++, which means it's not the easiest to get started with, to compile, or to document. But recently, they've released a Node.js wrapper, which I hoped would make things easier for me. One of my goals was to figure out how to use the library, but also to see if I could make it easier to get started with a more familiar drawing library that we know and use.



Hour 1 – 4: Getting Intel RealSense Set Up, Downloading and Installing RealSense Viewer, Installing Node-LibRealSense library bindings and C++ LibRealSense SDK, Playing Around With Different Configuration Settings in Viewer

Hour 5: Opening and running the Node example code. I see a somewhat complicated example of sending RealSense data through WebSockets; it seems promising, and I want to try to build my own.

Hour 6: Looking at different frontend libraries or frameworks (React, Electron) before deciding to just plunge in and write some code.

I’m able to open a context, look through available devices and sensors, and get a specific sensor based on its name, either “Stereo Module” or “RGB Camera”. Then I can get all the stream profiles for that sensor (there are a lot of profiles, depending on fps, resolution, and type: infrared or depth), but the most basic one I want is a depth stream at 1280*720 resolution and 30 fps.

I can open a connection to the sensor by .open() which opens the subdevice for exclusive access.

Hour 7: Lots of progress! I can start capturing frames by calling .start() and providing a callback function. A DepthFrame object is passed to this callback every frame, and it consists of the depth data, a timestamp, and a frame count number. I can then use the Colorizer class that comes with the Node RealSense library to visualize the depth data by transforming it into RGB8 format. This has a problem though: the depth data is 1280*720 = 921600 pixels, and stored as RGB8 this becomes 921600 * 3 = 2764800 bytes, or about 2.76 MB. At 30 frames per second, this would be equivalent to nearly 83 MB of data per second! Probably way too much for streaming anything between applications. We can compress this using a fast image compression library called Sharp, and we get quite good results with this. Setting our image quality to 10, we get 23 kb per frame, or 690 kb/s. Setting our image quality to 25 gets us 49 kb per frame, or about 1.5 MB/s (which is quite reasonable). Even at image quality 50, which is 76 kb per frame, we average about 2.3 MB/s. From this, I estimate it is quite reasonable to stream the depth data between local applications, with potential to even stream over the Internet. I might try that next.
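The bandwidth arithmetic above, written out as a small check:

```java
public class DepthBandwidth {
    public static void main(String[] args) {
        int w = 1280, h = 720, fps = 30;
        int depthPixels = w * h;            // 921,600 depth values per frame
        long rgb8Bytes = depthPixels * 3L;  // 2,764,800 bytes, about 2.76 MB, as RGB8
        double rawMBperSec = rgb8Bytes * fps / 1e6;
        System.out.println("uncompressed: " + rawMBperSec + " MB/s"); // ~82.9 MB/s

        // Compressed frame sizes observed at different quality settings (10, 25, 50)
        int[] kbPerFrame = {23, 49, 76};
        for (int kb : kbPerFrame) {
            System.out.println(kb + " kb/frame -> " + (kb * fps) + " kb/s");
        }
        // 23 -> 690 kb/s, 49 -> 1470 kb/s (~1.5 MB/s), 76 -> 2280 kb/s (~2.3 MB/s)
    }
}
```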

Current Backend Diagram

[Insert Image]

Hour 8 – 9: More progress. I got stuck in a few sticky spots but ended up getting through them, and now we have a quick-and-dirty working implementation of RealSense over WebSockets. I connected it using the WebSockets Node library (ws).

Challenges Here + Blob

Hour 8 (backend w/ websockets):

Hour 9 (connecting w/ p5.js):

The video below shows the color depth image being processed directly in p5.js using a very simple color-matching algorithm (showing blue pixels), which gives us a rudimentary depth-thresholding technique (this would be much easier to do directly in Node, since we have the exact depth values there; we will try that later). The second video shows another rudimentary technique: averaging the blue pixels to get the center of mass of the person in the video. Of course, this is all made much easier since we have the color depth image from RealSense. It would not really be possible, at least in a frontend library like p5.js, without sending depth camera information across the network, since most computer vision libraries still exist only for the backend. This introduces some new possibilities for creative projects using a depth camera plus frontend creative coding libraries, especially since JavaScript is so adept at real-time networking and interfacing.
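The center-of-mass averaging can be sketched like this (the "is blue" test and the flat-array image layout are simplified assumptions, not the actual p5.js code):

```java
public class BlueCenterOfMass {
    // Average the coordinates of every "blue enough" pixel to get a rough
    // center of mass; returns {x, y}, or null if no pixel matched.
    // rgb is row-major, with 3 channel values (0-255) per pixel.
    public static float[] center(int[] rgb, int w, int h) {
        float sumX = 0, sumY = 0;
        int count = 0;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int i = (x + y * w) * 3;
                int r = rgb[i], g = rgb[i + 1], b = rgb[i + 2];
                if (b > 150 && b > r && b > g) { // crude "is blue" test
                    sumX += x;
                    sumY += y;
                    count++;
                }
            }
        }
        if (count == 0) return null;
        return new float[] { sumX / count, sumY / count };
    }

    public static void main(String[] args) {
        // 4x1 test "image": blue pixels at x=1 and x=3, so the center is x=2
        int[] img = { 255, 0, 0,   0, 0, 200,   0, 255, 0,   0, 0, 200 };
        float[] c = center(img, 4, 1);
        System.out.println(c[0] + ", " + c[1]); // 2.0, 0.0
    }
}
```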

Before moving on to play with more examples of depth, specifically, point clouds, depth thresholding, and integrating openCV to do more fun stuff, and figuring out the best way to interface with this, I want to see if I can get the depth data sent through to Processing as well.

Hour 10 (Processing): I spent this hour researching ways to send blobs (binary large objects) over the network, settled on WebSockets or OSC, and looked into whether Java can actually decipher blob objects. I decided to move on instead of continuing to work on this part.

Hour 11 and Hour 12

I was met with a few challenges. One of the main ones was struggling with asynchronous vs. synchronous frame polling. I did not know that the Node.js wrapper had two different types of calls to poll for frames: a synchronous, thread-blocking version (pipeline.waitForFrames()) and an asynchronous one (pipeline.pollForFrames()). In any case, the async version is what we want, but we would need to implement an event-loop timer (setInterval, or preferably something better like a draw() function) that calls pollForFrames roughly every 33 milliseconds to hit 30 frames per second.

Hour 13 and Hour 14

Streaming raw depth data, receiving it in p5, and processing the image texture as points to render a point cloud; lots of problems here.

I wanted to stream the raw depth data because, as of now, I have color depth data, but having to do post-processing on it is a pain. I wanted to send the depth data as an image with depth values encoded between 0 and 255 as a grayscale image, or as raw values that we can convert with a depth scale. This would be similar to how the Kinect operates.

I thought it would be pretty simple: just get the depth scale of the sensor, multiply each raw depth value by it, and write the result back to the data buffer.
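The intended conversion can be sketched like this (the depth scale and working range below are placeholders; a real device reports its own scale):

```java
public class RawDepthToGray {
    // Convert raw 16-bit depth units to meters with the sensor's depth scale,
    // then map a working range (0..maxMeters) onto a 0-255 grayscale value.
    public static int toGray(int rawDepth, double depthScale, double maxMeters) {
        double meters = rawDepth * depthScale;
        if (meters <= 0 || meters >= maxMeters) return 0; // out of range: black
        return (int) (255 - (meters / maxMeters) * 255);  // nearer means brighter
    }

    public static void main(String[] args) {
        double depthScale = 0.001; // e.g. 1 raw unit = 1 mm (device-specific!)
        System.out.println(toGray(1000, depthScale, 4.0)); // 1 m away: bright
        System.out.println(toGray(3900, depthScale, 4.0)); // near max range: almost black
    }
}
```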

However, I was stuck for quite a long time because I was sending nonsensical values over websockets which resulted in a very weird visualization in p5. I wish I took a video of this. Anyways, I believe I was doing something wrong with the depth scale and my understanding of how the raw values worked. I decided to go to sleep and think about it the next day.

Hour 15 and 16

When I woke up, I realized something simple that I overlooked. I realized that I did not need to convert the raw values as I remembered the Realsense Viewer had an option to view the depth data as a white-to-black color scheme. I realized I could toggle the color scheme configuration when I was calling the colorizer() function to convert the depth map to an RGB image.

If I did: this.colorizer.setOption(rs.option.OPTION_COLOR_SCHEME, 2)

I could set the color scheme to an already-mapped 0-255 grayscale image. Then it would be the same process as sending the color image over. The simplicity of this approach was unbelievable: right after this realization and implementation, I tried it, and the results were immediate. I was able to send the depth image to p5, and I could have p5 sample the depth image to render a point cloud (see the video below).

Hour 17 and 18

I was able to link multiple computers to receive the depth information being broadcast from the central server (laptop). I also took a demo video in a taxi, as I realized that I was not limited in where I could use the Realsense, since it is portable and powered over USB. I simply ran the p5 sketch and the local server, and it works!

Taxi Test Video:


Hour 19 and 20

I used the remaining time to work on a quick presentation on my findings and work.

Conclusions and Future Work

After doing this exploration, I think there is potential in further developing this into a more fully-fledged “Web Realsense” library, similar to how Kinectron works for the Kinect.

Midterm Project – A Galaxy of Code; De Angelis (Moon)

A Galaxy of Code 

For our midterm project, Olesia Ermilova and I tried to recreate a galaxy in Processing. Our project was mainly inspired by one of Olesia’s animation projects, in which she designed a space adventure made up of black and white planets and stars. Below you can see a video of her original project.

Thereafter, we decided to make a similar version of this project, yet with high levels of interaction. Our initial idea for interactivity involved face recognition. We wanted our code to detect the user’s face, and according to the position of the user on the screen, our galaxy would rotate accordingly and follow the user through her or his movement.

The interaction was inspired by a commercial game called Loner. The main idea of the game is to use your body to navigate through the game’s interface. Here is a video, for a more thorough understanding.

In short, our main idea was to combine the interaction of Loner with Olesia’s galaxy project. In this way, we would improve the visuals of an already successful game by adding a more complex and aesthetic setting to the world through which the user has to navigate. Our game would bring tranquility to the user, with its easy interaction and galaxy setting. It was also partly inspired by our childhood, when our mothers would stick glow-in-the-dark stars on our ceilings.

Unfortunately, our interaction did not go as planned. As I mentioned before, in order to create an in-depth feel to our visuals, we needed to change our render mode from 2D to 3D. This would allow us to toy not only with the x and y scale of our code but with the z location of our objects as well. The problem lay in the fact that 3D graphics are not compatible with Processing’s Camera library, hence interaction through face recognition was not an option.

Here is a video of our final project, in which you can see the final visuals and how we used the z-scale to create a feeling of depth in our galaxy.

To create our final project, we toyed around with several of the processing skills we learned throughout the semester so far. For starters, we created a class for each object, which helped us organize our code.

In said classes, we defined the functions we wanted each object to be subject to. For instance, we had our restart() function in all sections, which allowed all objects to be recreated once they were off screen. Meaning, as soon as the objects reached their maximum z-value, they would once again appear at the back of our galaxy, constantly recreating themselves.

We played around with the speed of the planets, stars, and moons, in order to move the items around instead of having them stay intact in one place.

Our planet function, for instance, was as follows:

Planet (float _x, float _y, float _z) {
x = _x;
y = _y;
z = _z;
rad1 = random (15, 25);
rad2 = random (40, 50);
rad3 = random (10, 15);
velX = 0;
velY = 0;
velZ = random(0.5, 1.0);
}


and the characteristics of each planet were the following:

void display(float offsetX, float offsetY) {

translate (x + offsetX, y + offsetY, z);
fill (255);
ellipse(0, 0, rad1*2.5, rad1*2.5);
ellipse(0, 0, rad2*2, rad3*2);
}


In order to move the planets around the screen, we created our void fast() function:

void fast() {
x += velX * 10;
y += velY * 10;
z += velZ * 10;
}


void restart() {

if (z > 1000)
z = -1000;
}


and finally, our restart which was previously explained.

The moon and stars classes had similar attributes; find our complete code here.

Now on to interaction. For our interaction aspect, we decided to switch from face recognition to Leap Motion. We used Processing’s Leap Motion library to call back finger positions, switching our planets’ x and y locations to each finger’s location. Therefore, a planet would be created according to each finger’s position on the screen.

According to the feedback we received from the panel, this interaction could be significantly improved. They did not like the fact that the user would always have 5 planets, given that it took away from the galaxy’s realism. Five was an awkward number of planets to have, and the fact that they all remained together reduced the realistic feeling of our setting.

We also included hand grab gestures into our code. Whenever the user would close their fists, the velocity of all our objects would multiply by 10, giving the user the feeling that they were navigating through space and could alter the navigation to their preference.

In the future, we would like to modify our project and turn it into a more immersive experience. By this, we mean we would like to discard the Leap Motion functions and use hardware that receives input from the user’s entire body, such as the Microsoft Kinect. Perhaps we will develop this idea further for our final project.

Kinetic Interfaces – Midterm Project (Francie)

Among the assignments in the first half of the semester, I am interested in exploring the webcam and playing with pixel manipulation. Inspired by the built-in widget in my Macbook, I came up with the idea of combining pixel manipulation with the tile game. Basically, a flat image is evenly divided into 16 tiles, and one of them is replaced by an empty position. Each time the tiles next to the empty space can be moved into it, and we can organize the disordered tiles in this way. In my project, I want to replace the still image with a real-time video shot by the webcam.


I started with capturing the video images from the webcam and tried to divide the pixels into tiles. It turned out that this step required a large amount of mathematical calculation. Let’s use the top left tile as an example. First of all, I need to give a range for both x and y values; here the coordinates are limited to 0 < x < width/4 and 0 < y < height/4. Then I take out the pixels from the original webcam image using the formula index = x + y * width. Now I have picked out the indices of the pixels that belong to the first of the sixteen tiles. Next, I display these pixels on a new canvas with 1/4 width and 1/4 height. By transforming the formula into _index = x + y * width/4, I can cut out 1/16 of the video image and put the pixels into a container called img11.
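The index arithmetic for the top left tile can be checked with a tiny standalone example (plain Java, using an 8x8 stand-in image rather than real webcam pixels):

```java
public class TileIndex {
    // Copy the top-left quarter-tile of a w x h image (stored as a flat
    // row-major array) into its own (w/4) x (h/4) buffer, using
    // srcIndex = x + y * w and dstIndex = x + y * (w/4).
    public static int[] topLeftTile(int[] src, int w, int h) {
        int tw = w / 4, th = h / 4;
        int[] dst = new int[tw * th];
        for (int y = 0; y < th; y++) {
            for (int x = 0; x < tw; x++) {
                dst[x + y * tw] = src[x + y * w];
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        // 8x8 "image" whose pixel value encodes its own flat index
        int w = 8, h = 8;
        int[] src = new int[w * h];
        for (int i = 0; i < src.length; i++) src[i] = i;
        int[] tile = topLeftTile(src, w, h);
        // The resulting 2x2 tile holds {0, 1, 8, 9}: the top-left corner pixels.
        for (int v : tile) System.out.println(v);
    }
}
```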

After figuring out the first tile, I derived the formulas for the other tiles from it. The x and y values of the remaining tiles need to be restored by subtracting their displacements from the top left. Here is how to get the coordinates of the second row: the y values in the formula become y – height/4, while the x values are adjusted individually to x – width/4, x – width/2, and x – width*3/4 according to their positions.
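Those per-tile offsets generalize to one helper that works for any tile in the 4×4 grid. This is a hypothetical sketch of the idea rather than the code from the post:

```java
public class TileIndex {
    /**
     * Source-pixel index for local coordinates (tx, ty) inside tile
     * (col, row) of a w*h image split into a 4x4 grid. This inverts the
     * per-tile offsets x - col*width/4 and y - row*height/4 described above.
     */
    static int sourceIndex(int col, int row, int tx, int ty, int w, int h) {
        int x = tx + col * (w / 4);   // undo the horizontal displacement
        int y = ty + row * (h / 4);   // undo the vertical displacement
        return x + y * w;             // index = x + y * width
    }
}
```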

Another challenge in this project was switching the positions of the tiles. I fill the bottom-right tile with white instead of updating its pixels from the video, so that it looks like an empty space. To simplify the interaction, I use the keyboard to control the movement. A “switch case” statement lets me give separate instructions for each direction key. However, I was stuck at this step for a long time because the printed coordinates were correct, but the images did not change at all. Here I must credit my friend Joe, who helped me figure out the solution. On the one hand, it is important to pay attention to the order of the code: the values assigned to the variables keep changing, and I must be clear about what each variable means on each line.
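A minimal sketch of the switch-case move logic, assuming the board is stored as a flat array of 16 slot indices (the names and the exact key handling are my assumptions, not the original sketch):

```java
public class TileBoard {
    int[] slots = new int[16]; // slots[i] = which tile currently sits at position i
    int empty = 15;            // the bottom-right slot starts as the empty space

    TileBoard() { for (int i = 0; i < 16; i++) slots[i] = i; }

    /**
     * Slide a tile into the empty slot. dir is "UP", "DOWN", "LEFT", or
     * "RIGHT"; "UP" means the tile below the empty slot slides up, etc.
     * Returns false when the move would fall off the board.
     */
    boolean move(String dir) {
        int r = empty / 4, c = empty % 4, from = -1;
        switch (dir) {
            case "UP":    if (r < 3) from = empty + 4; break; // tile below slides up
            case "DOWN":  if (r > 0) from = empty - 4; break; // tile above slides down
            case "LEFT":  if (c < 3) from = empty + 1; break; // tile to the right slides left
            case "RIGHT": if (c > 0) from = empty - 1; break; // tile to the left slides right
        }
        if (from < 0) return false;
        slots[empty] = slots[from]; // move the tile into the empty slot
        empty = from;               // the vacated slot becomes the new empty space
        return true;
    }
}
```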

On the other hand, the declaration of arrays also confused me a lot. My friend used the method below to finally make the tiles movable, but I still do not understand why the extra assignment is necessary.
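I can only guess at the code in question, but one common Java (and therefore Processing) pitfall matches the symptom: assigning one array to another copies a reference, not the contents, so both names point at the same pixel buffer and a swap appears to do nothing. An explicit copy, which may be the "extra assignment", fixes it:

```java
import java.util.Arrays;

public class ArrayCopyDemo {
    public static void main(String[] args) {
        int[] tileA = {1, 2, 3};
        int[] alias = tileA;          // NOT a copy: both names share one buffer
        alias[0] = 99;
        System.out.println(tileA[0]); // prints 99: the "original" changed too

        // The extra assignment: a real element-by-element copy.
        int[] copy = Arrays.copyOf(tileA, tileA.length);
        copy[0] = 7;
        System.out.println(tileA[0]); // still prints 99: tileA is untouched
    }
}
```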

And there is one more mysterious operation. In the beginning I created the images line by line, which looked a little verbose, so I used a for-loop to generate them in order to make the code cleaner. But somehow it no longer worked. The two ways feel equivalent to me, yet only the long-winded one is valid. Why is it problematic to use a for-loop here?

The demo of the tile game comes below:

I really love this project and want to add more creativity to this classic game. Currently the tiles do not appear randomly, and I have to disorder them manually before I start to play, so it would be better if I could randomize the tiles. I also plan to improve the interaction by using a Leap Motion or Kinect rather than the keyboard. Moreover, I would like to add 3D perspectives and projection mapping for a more engaging user experience.

Kinetic Interfaces (Midterm): A Band – Sherry

Partner: Peter

Ideation: Peter and I agreed on doing an instrument simulation using Leap Motion after a brief discussion, but later I found this fun project in Chrome Music Lab and came up with the idea of a choir conductor controlled with Leap Motion. When we met with Moon, he suggested that we combine the two ideas with the help of the oscP5 library, so we ended up creating a band in which a conductor controls several musical instruments, with each conductor/instrument running in its own Processing sketch with a Leap Motion.

Implementation: Due to time constraints, we only had one kind of instrument, the guitar, communicating with the conductor. Peter was in charge of the implementation of the conductor, and I worked on the guitar. We both did some research and got a basic understanding of how oscP5 works.

I got the inspiration from GarageBand and tried to draw a horizontal guitar interface as shown in the picture above, but during testing I found that the strings were too close together and the accuracy was very low: it was difficult to pluck a particular string. Therefore I decided to switch to a vertical interface:

Since the Leap Motion has a wide detection range (and is more sensitive) along the x-axis, the accuracy is higher and the user experience is better. The strings have different stroke weights to imitate a real guitar.

Above was my first version of the string-triggering code. However, with this approach, messages are printed continuously (and the sound file is played again and again) as long as my finger stays in range, which wouldn’t happen on a real guitar. To solve this problem, I changed the algorithm for detecting a pluck.

The main idea is to figure out which strings lie between the previous and current finger positions and play the corresponding sounds in order. I denote the six strings by indices 0 to 5; the floats “start” and “end” are then the string positions corresponding to the previous and current finger positions. If start is greater than end, the user swiped to the left, and the strings in between are triggered from right to left, and vice versa. With this algorithm, holding the finger on a string doesn’t trigger the sound, making the experience more realistic.
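That algorithm can be sketched in plain Java; start and end here are fractional string positions, and the exact rounding at the boundaries is my assumption rather than the original code:

```java
import java.util.ArrayList;
import java.util.List;

public class Pluck {
    /**
     * Strings crossed when the finger moves between two positions.
     * Strings sit at integer positions 0..5; a string sounds only when the
     * finger passes over it, in swipe order. Resting on a string
     * (start == end) triggers nothing.
     */
    static List<Integer> crossed(float start, float end) {
        List<Integer> hit = new ArrayList<>();
        if (start == end) return hit;               // finger held still: silent
        if (start < end) {                          // swipe right: low to high
            for (int s = (int) Math.floor(start) + 1; s <= (int) Math.floor(end); s++)
                hit.add(s);
        } else {                                    // swipe left: high to low
            for (int s = (int) Math.ceil(start) - 1; s >= (int) Math.ceil(end); s--)
                hit.add(s);
        }
        return hit;
    }
}
```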

I introduced the z-axis of the Leap Motion to simulate the finger being on or above the strings. When z is less than the threshold (the finger moving closer to the body), the dot that shows the position of the index finger is white, indicating that the finger is above the strings and won’t trigger any sound as it moves. When z is greater than the threshold, the dot becomes red and sounds will be triggered.
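Together with the conductor's mute flag described later, the red/white dot reduces to a small predicate. A hedged sketch (the threshold value and the names are assumptions):

```java
public class FingerState {
    static final float Z_THRESHOLD = 0f; // assumed cutoff; tune to the Leap's z range

    /**
     * true  = red dot: finger is on the strings and may trigger sounds;
     * false = white dot: muted by the conductor, or pulled back toward the body.
     */
    static boolean canTrigger(float z, boolean muted) {
        return !muted && z > Z_THRESHOLD;
    }
}
```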

String vibration effect and sine waves were added to enhance the visual experience.

Video demo of the leap motion guitar:


Then I started to study the oscP5 library. Thanks to the “oscP5sendreceive” example by andreas schlegel and noguchi’s example, I created a very basic data-communication program: mouseX and mouseY in osc1 (the left sketch) were sent to osc2 (the right sketch) and used to draw an ellipse on the canvas.

Later I met Peter, who had used the oscP5tcp example for the conductor, so we decided to use TCP for both. Initially we planned to pass three parameters (a volume modifier, a frequency modifier, and a mute boolean), but we ran into two problems. Because of the limitations of the Minim library, we couldn’t change the volume or the frequency of a sound file directly. After several trials we managed to modify the volume using setGain() instead of setVolume(), but unfortunately we could do nothing about the frequency.
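Minim's setGain() works in decibels rather than a 0–1 volume, so the conductor's volume modifier has to be mapped to a dB value before being applied. One possible mapping (the −40 dB floor is my assumption, not a Minim constant):

```java
public class VolumeControl {
    /**
     * Map a 0..1 volume modifier from the conductor to a decibel gain for
     * Minim's setGain(): 0 dB leaves the sound unchanged, negative values
     * make it quieter. Input is clamped to [0, 1].
     */
    static float toGainDb(float volume) {
        float v = Math.max(0f, Math.min(1f, volume));
        return -40f + 40f * v; // 0 -> -40 dB (near-silent), 1 -> 0 dB (full)
    }
}
```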

Final demo:

Index finger: swipe horizontally above the Leap Motion to pluck the strings; pull it back towards the body to lift it off the strings.

Dot on the screen: red – available, will trigger sounds when moving across strings; white – unavailable, either muted by the conductor or too close to the body.

Hand: move up and down in an instrument’s section of the screen to increase/decrease its volume; grab to mute that instrument.

Feedback: Professor Chen brought up the “why” question, which I think is quite important and deserves further reflection. I agree with her that letting the conductor actually control the other users is a great idea, but I can’t really answer why we need a Leap Motion simulating a real instrument when the experience of playing a physical instrument is already good. I’m thinking of keeping the tech part but wrapping it in a different idea that is more interesting or meaningful (though I have no concrete idea for now).

Kinetic Interfaces: Midterm – Collaborative Musical System (Peter)

Project Idea

The idea of this project is to create a collaborative music system with kinetic interfaces. The system consists of a conductor, several instruments, and Leap Motions as inputs, all connected over the internet. To be more specific, the conductor monitors the music data from the instruments and controls their musical features, while each instrument is played on a digital instrumental interface with a Leap Motion. In general, the project aims to create a system in which users can make music cooperatively, and to explore what collaborative computing can do for musical, or even artistic, creation.


The conductor and the virtual instruments are connected over the internet. To achieve this, we used a Processing library called oscP5. After experimenting with several of the example connection methods provided by the library, we decided that oscP5’s TCP connection was the most suitable for this project. Using the TCP connection methods, we are able to make the server and the clients respond in real time. More specifically, the clients repeatedly send pings to the server (every 0.5 seconds), and the server acknowledges the pings and responds with the musical-feature data for the clients. This way of connecting also allows the system to track which clients exist: if a node stops pinging the server for long enough, the server deletes it from its list and modifies the interface accordingly.
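The ping-and-timeout bookkeeping on the server side might look like the sketch below. The 2-second timeout is an assumption (roughly four missed pings at the 0.5 s interval), and the class is illustrative rather than our actual code:

```java
import java.util.HashMap;
import java.util.Map;

public class ClientRegistry {
    static final long TIMEOUT_MS = 2000; // assumed: ~4 missed 0.5 s pings

    final Map<String, Long> lastPing = new HashMap<>();

    /** Record a ping from a named instrument (new or returning). */
    void ping(String name, long nowMs) {
        lastPing.put(name, nowMs);
    }

    /**
     * Drop clients that have stopped pinging; the conductor then rebuilds
     * its interface from the surviving entries.
     */
    void prune(long nowMs) {
        lastPing.entrySet().removeIf(e -> nowMs - e.getValue() > TIMEOUT_MS);
    }

    int size() { return lastPing.size(); }
}
```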

Instrument (Client)

Each instrument has two tasks. First, it provides the player with a digital interface through which the instrument can be played with a Leap Motion. Second, it sends pings to the server. The pings include the name of the instrument, so the server can identify whether this is a new instrument that just came online or an old one. This means that as long as a server is online, the clients are free to come and go, which improves the scalability of the system.

Conductor (Server)

For now, the conductor has two main functions. First, it controls the instruments connected to it by sending them musical-feature data. Currently the musical features only contain the volume, but we are going to explore additional features that make sense to play with. Second, the conductor is aware of the size and the identity of the whole network. This has two advantages: we can design each instrument’s features individually and send them to the correct instrument, and the conductor’s interface can be dynamic. For instance, when two instruments are connected, the interface is split in two, and so on as more instruments connect; if one or more instruments leave the system, the interface changes accordingly.
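The dynamic split could be as simple as integer division of the window width among the connected instruments; a hypothetical sketch:

```java
public class ConductorLayout {
    /**
     * x-extent [start, end) of instrument i when a w-pixel-wide conductor
     * window is split evenly among n connected instruments. Recomputed
     * whenever an instrument joins or leaves.
     */
    static int[] panel(int i, int n, int w) {
        int start = i * w / n;
        int end = (i + 1) * w / n;
        return new int[]{start, end};
    }
}
```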


This project is meant to be developed for the whole semester, and this midterm project is the first phase of it. For the rest of the semester, we are expecting the following improvements:

  1.    Create additional interfaces for additional musical instruments.
  2. Explore in more detail which musical features make sense to alter.
  3. Consider using the Microsoft Kinect as the input device for the conductor.
  4. Improve and beautify the interfaces for both the instruments and the conductor.
  5. We might try to extend the system out of LAN to achieve better flexibility and scalability.

Video Demo

Week 8: Midterm

Proposal: For my midterm project, I wanted to explore several of the themes in Sophocles’ tragedy Philoctetes. The play tells the story of Philoctetes, a famed Greek archer en route to fight in the Trojan War. On the way to Troy, Philoctetes is bitten by a cursed snake. The bite can never heal, and the leg wound puts him in unbearable agony. He is in constant pain, screaming and crying out; Odysseus and the other soldiers can’t take the sound of his suffering, and so they abandon Philoctetes on the island of Lemnos. Philoctetes’ exile lasts ten years, during which he is cast out from society and no one comes to his aid. When he is eventually “rescued”, Philoctetes learns that many of his friends have passed away. Central to this legend is the relationship between society and individual pain; society only has tolerance for a certain level of expressed emotional pain, and does not accommodate anything more than that.

I have done Philoctetes-related projects in several of my other classes; I wanted to build off of my interactive chair project from Exhibition: Next for Kinetic Interfaces. In Exhibition: Next, I placed a chair in a dark room with a pair of headphones and had everyone sitting in the chair put them on. They would then hear an audio recording of someone sobbing. It was really interesting to watch people’s reactions to the project – some sat and listened for several minutes, others ripped the headphones off immediately. I wanted to explore the discomfort of witnessing someone else’s pain. The idea I had in mind involved using facial recognition to trigger audio of someone crying and change an image onscreen.

Documentation: While I originally tried to use FaceOSC for this project, I switched to OpenCV and used the Minim library to play the audio clip whenever a face was detected on the webcam, and pause it whenever the viewer looked away. I also sketched the following two images in Krita; somewhat counter-intuitively, it was the first image that was displayed when the audio played, and the second that displayed when it didn’t (one would expect someone to be crying into their arms, and not making eye contact, when sobbing).
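The control logic reduces to a small state update per frame. This is a sketch with hypothetical names; in the real sketch the booleans would drive Minim's play()/pause() calls and the choice of image to draw:

```java
public class SobController {
    boolean playing = false;
    String image = "looking-away"; // hypothetical names for the two Krita sketches

    /** Called once per frame with the number of faces OpenCV detected. */
    void update(int faces) {
        boolean seen = faces > 0;
        if (seen && !playing) playing = true;  // someone looked: start the sobbing clip
        if (!seen && playing) playing = false; // they looked away: pause it
        image = seen ? "crying" : "looking-away";
    }
}
```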

I wanted to try several different things for this project; for example, I wanted to experiment with soundGain so that when multiple faces were detected, the volume of the sobbing would seem louder. But I did not have time to implement this interaction. I did receive a lot of useful feedback during presentations, however: people suggested that I film a video of someone crying instead, that I change the size of the display, and that I consider how I would install this prototype as a larger, more complete project.

Installation: I think the key for an installation like this would be putting it in a place where people do not expect it, so that they feel a jolt of discomfort and uncertainty when they trigger the interaction; seeing someone cry openly on the subway, for example, one is not sure where to look.

While I am not sure how feasible it is in this class, I thought that the overlap between facial recognition and projection mapping would be a really interesting way to simulate a human interaction and emphasize a sense of discomfort. Could I projection-map onto some sort of vaguely human-like mannequin? I could set up the installation in a variety of ways, so that someone stumbles across this human-like mannequin and, when they look at it, the mannequin starts to cry; when they look away, it stops. I think this would be a really interesting scenario for playing with discomfort, but it might also shift the meaning/emphasis of my piece in another direction.

Another idea would be projection-mapping a silhouette similar to the midterm’s on the walls of a small room. Once someone walks into the room, they will have the option of looking at several other pieces; the moment they notice the silhouette, though, and their face is recognized, the silhouette will start crying and trying to make eye contact. If the person looks away and tries to ignore the silhouette, it will move closer to them, so that they have fewer options to avoid looking at someone in pain.