After finishing the first iteration of our software prototype we had a user testing session to find out what the users liked or disliked. This is needed to find any basic design flaws early on and make changes when they are still feasible or even possible.
User test setup
In order to conduct the user tests we needed a large screen to have a realistic test environment. Fortunately we were allowed to use a conference room at one of our employers. The test setup consisted of a smart board as a display, the Kinect, and a laptop to run the software prototype on. The user was given a printed copy of the task description.
The test users
Usually we would choose people from our target group as test users. Unfortunately we were not able to find anyone from our target group who agreed to take part in the user test. Therefore we asked some of our friends, and two of them, Sascha and Leon, agreed to help us out. Sascha is a 33-year-old computer scientist who had no prior experience with the Kinect. Leon is a 32-year-old executive consultant who has had some prior experience with the Kinect.
The task that the test users had to fulfill was to create a new investigation case. Two new entities had to be added: a suspect and a piece of evidence. The entities had to be assigned some meaningful properties, e.g. age or place of finding.
Findings & Possible Solutions
Sascha tried to use hand gestures right away, but had a hard time with them because his movements were too hectic and he could not hit the buttons. We think these problems are due to the inaccuracy of the Kinect hardware. Since this problem is inherent to the system, there is nothing we can do about it except use the second generation of the Kinect hardware.
Since he used the hand gestures right away he totally forgot about using voice commands. Later on he confessed that he had actually forgotten how to activate the speech recognition. This could be solved by displaying a hint that tells the user how to interact with the software.
When he finally figured out how to use voice commands he had problems finding out the right speech commands. He said that he could not read the speech commands properly. The most obvious solution would be to increase the size of the font that is used to render the speech commands. The problem with this solution would be that the voice recognition pop-up would take up more space. Another possible solution would be to not have specific voice commands that the user has to learn by heart but rather semantically interpret what the user said and invoke a fitting command. But that would be very hard to implement.
After a short time the user adapted to the speech recognition and learned the various voice commands that are possible. Once he knew all the speech commands he was annoyed by the pop-up that displays a list of all possible voice commands. He told us that he would rather not display all the voice commands once the speech recognition is enabled, but have a help function that displays them. For more advanced users this would be the perfect solution. For novice users on the other hand it would be very annoying to ask for help all the time.
Furthermore Sascha thought that it was unclear when the software was listening for speech commands and when it was not. Right now a pop-up is displayed as long as the speech recognition is activated. A possible solution would be to add a label or a pulsating icon that signals that the software is listening.
After a while the user stated that he thought that the keyword „CRIME“ was too short and that it might not be clear to the software whether the keyword was said to activate the speech recognition or just in the context of a conversation. A solution to this would be to choose another word or to add another word, like Google did with their smart glasses: „OK Glass“. But in our opinion this is not needed because the application does not pick up the word when it is said in a sentence, but only if it stands alone. Therefore we think no action is needed.
All in all the user completed the task successfully and told us that the software was easy to use and intuitive despite the mentioned flaws.
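The "stands alone" rule can be illustrated with a small sketch. This is not the prototype's code. It assumes the recognizer delivers each utterance as a plain string, and the function name `is_activation` and the normalization step are hypothetical.

```python
import re

# Keyword taken from the post; the matching rule below is an assumption.
ACTIVATION_KEYWORD = "crime"


def is_activation(utterance: str) -> bool:
    """Return True only if the utterance consists of the keyword alone."""
    # Drop punctuation a recognizer might attach, then trim and lowercase.
    normalized = re.sub(r"[^\w\s]", "", utterance).strip().lower()
    # A keyword embedded in a longer sentence does not match.
    return normalized == ACTIVATION_KEYWORD
```

With this rule, saying the keyword on its own activates the speech recognition, while a sentence such as "the crime scene" is ignored.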
Leon began to use the software right after the introduction. He used the gestures very intuitively even though he had little prior experience with the Kinect. Although he navigated through the application intuitively, he had problems performing the pressing gesture. The test supervisor could observe that the user had stretched out his arm entirely, which meant he had to bend over in order to perform a pressing gesture. One solution to this problem could be to replace the pressing gesture with another gesture that requires less movement. But we think that the pressing gesture is the most intuitive one because it resembles the actual movement that a person performs when pressing a physical button. In our opinion these difficulties can be attributed to the user's limited experience, and the problem will resolve itself once he gets used to the gesture.
Since the software picks up both hands of the user, Leon tried to interact with both hands at the same time, which resulted in confusion because the software did not actually allow this. The solution could be to allow multiple actions at once. But we think this would pose more problems than it would solve since the causality might be lost (e.g. the delete button is pressed with one hand before the other hand was able to select the entity that is to be deleted). Therefore we decided to only display one hand at a time as long as the user is not on a view that contains a control that allows interaction with multiple hands.
When adding properties to the suspect the user wanted to create an alibi and did not quite know which of the property types would be the best fit for an alibi. Therefore he suggested integrating more types into the application. This would be a possible solution. On the other hand we might end up integrating dozens of different types, which would make it even more confusing for the user to choose one. Our reasoning behind adding the existing property types was that they would provide some semantics for the software so it would be easier to interpret the data (e.g. showing entities in the timeline or on the map). Therefore we only added the types that are needed by the application to interpret the data correctly. From this standpoint it would not make any sense to add more types. Also there is a generic property that can contain any type of information.
The user complained that when creating a new property the name has to be entered before defining the property type. He thought this to be very unintuitive. The solution to this is obvious: the steps of defining the type and entering the name have to be switched.
When trying to delete a property, Leon wanted to grab the property and drag it to the delete button (a gesture that is well known from applications like Photoshop). In his opinion it is too cumbersome to have to navigate to the property first. The solution he proposed (dragging and dropping the property onto some kind of trash bin) cannot be implemented, because the properties are displayed in a list and the grabbing gesture is therefore already used to scroll through the list.
While creating the alibi the user wanted to create a property that represents a time span rather than just a point in time. The solution would be to simply add another property type for time spans.
After the user had finished his task he played around with the application and tested some other functionality. In his opinion the map can be used very intuitively. The only problem that he had was that when the hand is out of the screen space the grabbing gesture is not recognized. The problem arises due to the fact that the user cannot see what the camera sees. A possible solution would be to display the depth or the color image of the camera to show the user where the camera sees his hand. This feature is actually already implemented. When the application is run in debug mode the depth image of the camera is displayed to the user. We thought it to be helpful while debugging but of no import to the user and therefore we removed it from the release version.
Leon also tested the speech recognition thoroughly. At first he tried to use intuitive speech commands, which obviously did not work. A proposal for solving this problem has already been presented in the evaluation of Sascha. Also he did not understand what the „<>“ notation meant. It actually means that the user has to fill in the name of the object he wants to perform an action on. This problem of understanding could be solved by displaying the actual options, for example with an animation that cycles through all options. He had the feeling that he had to say „CRIME“ all the time because the speech recognition deactivates itself when the user has not interacted with it for 10 seconds. The obvious solution would be to increase this time to 15 seconds or even more.
Just like Sascha he thought that the word „CRIME“ was not fitting for the keyword to activate the speech recognition because it might be used in conversations. The reasons why this is not a problem were already discussed above.
Leon was also confused by the structuring of the data in the application (cases are made up of entities and entities are made up of properties). For example he could not think of a reason why locations are not entities themselves instead of just properties. The reasoning behind this was already explained in the last blog post.
All in all he got used to the application really quickly and could solve his task in a short amount of time.
Changes to the prototype
In order to make it clearer to the user how to interact with the application using gestures and speech commands, we integrated some small hints that are displayed once the application starts. These are only shown once, right after the start of the application, and are dismissed as soon as the user starts to interact with the application. In our opinion this is the perfect compromise between helping novice users and not getting on the nerves of experienced users. This change addresses the problem that some users had in finding out how to interact with the application.
At first we wanted to let users dictate text to the application as an option for text input. It turned out that this does not work very well, so we had to come up with a better solution. Our approach to this problem is a companion app that can be run on a tablet. Every time the user encounters a view where he has to input text, the app enables him to type it in. This even goes as far as having different ways of entering information for different data types, e.g. when the user has to type in a date he is presented with a date picker. The app also enables users to upload pictures from the tablet to the application. The app and C.R.I.M.E are synchronized in real-time so that the user is not disrupted.
Now all entities that contain location properties are displayed on the map. The entities are displayed as pins that show some information about the entity. When clicking a pin the user is navigated to the entity view. There is also a view that contains all pictures of all entities of the current case. When pressing a picture the user is also navigated to the entity view.
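How entities could be selected for the map can be sketched as follows. This is only an illustration under assumptions: the classes `Property` and `Entity`, the type label `"location"`, and the function `map_pins` are hypothetical names and do not come from the prototype.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class Property:
    name: str
    type: str   # hypothetical type labels, e.g. "location", "date", "generic"
    value: Any


@dataclass
class Entity:
    name: str
    properties: List[Property] = field(default_factory=list)


def map_pins(entities: List[Entity]) -> List[Tuple[str, Any]]:
    """Return one (entity name, location value) pin per location property."""
    pins = []
    for entity in entities:
        for prop in entity.properties:
            # Only typed location properties carry map semantics;
            # generic properties are skipped.
            if prop.type == "location":
                pins.append((entity.name, prop.value))
    return pins
```

An entity without any location property produces no pin, which matches the idea that the property type tells the software how to interpret the data.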
The test users complained that the time before the speech recognition gets deactivated is too short. This problem was solved by increasing the time from 10 to 15 seconds.
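The timeout behaviour described above can be sketched as a small state holder. This is a minimal sketch under assumptions, not the prototype's implementation: the class `SpeechSession` and its method names are hypothetical, and the clock is injectable so that the behaviour can be checked without waiting.

```python
import time
from typing import Callable, Optional


class SpeechSession:
    """Tracks whether speech recognition is active, with an inactivity timeout."""

    def __init__(self, timeout_s: float = 15.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self._clock = clock
        # None means the recognition is currently deactivated.
        self._last_interaction: Optional[float] = None

    def activate(self) -> None:
        """Called when the activation keyword was recognized."""
        self._last_interaction = self._clock()

    def is_listening(self) -> bool:
        """Return True while the timeout has not elapsed since the last interaction."""
        if self._last_interaction is None:
            return False
        if self._clock() - self._last_interaction >= self.timeout_s:
            self._last_interaction = None  # deactivate after inactivity
            return False
        return True

    def on_command(self) -> bool:
        """Accept a command only while listening; each command restarts the timer."""
        if not self.is_listening():
            return False
        self._last_interaction = self._clock()
        return True
```

Because every accepted command restarts the timer, a user who keeps talking to the application does not have to repeat the keyword. Only a pause longer than the timeout deactivates the recognition.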
Also some bugs were fixed. For example when an entity or a property was renamed and the user navigated back to the entity/property view, the names had not updated. These bugs are now fixed.
One of the test users wanted to interact with the application with both hands even if that was not possible in the context. Therefore only one hand is now displayed. When the user navigates to a view that contains a map then the second hand is activated in order to be able to perform the pinch-to-zoom gesture on the map.
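The rule for showing hands can be expressed as a small function. The following is a sketch under assumptions: `View`, `supports_two_hands`, `visible_hands`, and the choice of a default primary hand are hypothetical and only illustrate the decision described above.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class View:
    name: str
    # True for views with a control that needs both hands, e.g. a map.
    supports_two_hands: bool = False


def visible_hands(view: View, tracked_hands: Sequence[str],
                  primary: str = "right") -> List[str]:
    """Return the hands that should be displayed on the given view."""
    if view.supports_two_hands:
        # Both hands are needed, e.g. for pinch-to-zoom.
        return list(tracked_hands)
    if primary in tracked_hands:
        return [primary]
    # Fall back to whichever hand is tracked, if any.
    return list(tracked_hands[:1])
```

On a map view both tracked hands are shown, while on any other view at most one hand is displayed, so two actions cannot be triggered at the same time.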
Finally when a new entity or property is created then the user has to first select the type. This change was made due to the problem that one of our test users had.
- Applying changes to the prototype
- Blog post
- Applying changes to the prototype
- Conducting user tests
- Evaluating the user tests
- Evaluating the user tests