This video shows how the C.R.I.M.E software might be used in a real-world scenario.
After finishing the first iteration of our software prototype we held a user testing session to find out what the users liked and disliked. This is needed to uncover basic design flaws early on, while changes are still feasible.
In order to conduct the user tests we needed a large screen to have a realistic test environment. Fortunately we were allowed to use a conference room at the workplace of one of our team members. The test setup consists of a smart board as a display, the Kinect, and a laptop to run the software prototype on. The user was given a printed version of the task description.
Usually we would choose people from our target group as test users. Unfortunately we were not able to find anyone from our target group who agreed to take part in the user test, so we asked some of our friends, and two of them, Sascha and Leon, agreed to help us out. Sascha is a 33-year-old computer scientist with no prior experience with the Kinect. Leon is a 32-year-old executive consultant who has some prior experience with the Kinect.
The task the test users had to fulfill was to create a new investigation case. Two new entities had to be added: a suspect and a piece of evidence. The entities had to be assigned some meaningful properties, e.g. the suspect's age or the location where the evidence was found.
Sascha tried to use hand gestures right away, but had a hard time with them because his movements were too hectic and he could not hit the buttons. We think these problems are due to the inaccuracy of the Kinect hardware. Since this problem is inherent to the system, there is nothing we can do about it, except for using the second generation of the Kinect hardware.
Since he used the hand gestures right away, he completely forgot about using voice commands. Later on he admitted that he had actually forgotten how to activate the speech recognition. This could be solved by displaying a hint that tells the user how to interact with the software.
When he finally figured out how to use voice commands, he had trouble finding the right ones. He said that he could not read the speech commands properly. The most obvious solution would be to increase the size of the font used to render the speech commands; the drawback is that the voice recognition pop-up would then take up more space. Another possible solution would be to not have specific voice commands that the user has to learn by heart, but rather to interpret what the user said semantically and invoke a fitting command. That, however, would be very hard to implement.
After a short time the user adapted to the speech recognition and learned the various voice commands. Once he knew all of them, he was annoyed by the pop-up that displays a list of all possible voice commands. He told us that he would rather not have all the voice commands displayed once the speech recognition is enabled, but instead have a help function that displays them. For advanced users this would be the perfect solution; novice users, on the other hand, would find it very annoying to ask for help all the time.
Furthermore, Sascha thought it was unclear when the software was listening for speech commands and when it was not. Right now a pop-up is displayed as long as the speech recognition is activated. A possible solution would be to add a label or a pulsating icon that signals that the software is listening.
After a while the user stated that he thought the keyword "CRIME" was too short and that it might be unclear to the software whether the keyword was said to activate the speech recognition or just in the context of a conversation. A solution would be to choose another word, or to add a second word like Google did with their smart glasses: "Okay, Glass". In our opinion this is not needed, because the application does not pick up the word when it is said within a sentence, but only when it stands alone. Therefore we think no action is needed.
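The standalone-keyword behavior described above can be sketched in a few lines. This is a minimal Python illustration of the idea, not our actual implementation (the prototype itself is written in C#, and the function name is hypothetical): the keyword only activates the recognition when it is recognized as an utterance on its own.

```python
KEYWORD = "crime"

def should_activate(utterance: str) -> bool:
    """Activate only when the recognized utterance is the keyword by itself."""
    return utterance.strip().lower() == KEYWORD

# "CRIME" said on its own activates the speech recognition ...
assert should_activate("CRIME")
# ... while the keyword inside a longer sentence is ignored.
assert not should_activate("this crime happened yesterday")
```

The same effect falls out naturally when the recognizer is given a grammar whose activation rule matches the single keyword only.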
All in all the user completed the task successfully and told us that the software was easy to use and intuitive despite the mentioned flaws.
Our second test user, Leon, began to use the software right after the introduction. He used the gestures very intuitively despite having only little prior experience with the Kinect. Even though he navigated through the application intuitively, he had problems performing the pressing gesture. The test supervisor observed that the user stretched out his arm entirely, which forced him to bend over in order to perform the pressing gesture. One solution could be to replace the pressing gesture with another gesture that requires less movement. But we think the pressing gesture is the most intuitive one, because it resembles the actual movement a person performs when pressing a physical button. In our opinion these difficulties can be attributed to the user's limited experience, and the problem will solve itself once the user gets more practice.
Since the software tracks both hands of the user, Leon tried to interact with both hands at the same time, which resulted in confusion because the software did not actually allow this. One solution could be to allow multiple actions at once. But we think this would pose more problems than it would solve, since causality might be lost (e.g. the delete button is pressed with one hand before the other hand was able to select the entity that is to be deleted). Therefore we decided to display only one hand at a time, as long as the user is not on a view that contains a control allowing interaction with multiple hands.
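The gating rule we settled on can be summarized in a short sketch. This is an illustrative Python fragment with hypothetical names (the prototype itself is C#): only the primary hand cursor is shown, unless the current view contains a control that supports multi-hand interaction.

```python
# Views whose controls allow two-handed interaction (assumed set).
MULTI_HAND_VIEWS = {"map"}

def visible_hands(view: str, tracked_hands: list) -> list:
    """Return the hand cursors that should be displayed for this view."""
    if view in MULTI_HAND_VIEWS:
        return tracked_hands       # both hands, e.g. for pinch-to-zoom
    return tracked_hands[:1]       # otherwise only the primary hand

# On an ordinary view only one cursor appears; on the map both do.
assert visible_hands("entity", ["left", "right"]) == ["left"]
assert visible_hands("map", ["left", "right"]) == ["left", "right"]
```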
When adding properties to the suspect, the user wanted to create an alibi and did not quite know which of the property types fit an alibi best. He therefore suggested integrating more types into the application. This would be a possible solution. On the other hand, we might end up integrating dozens of different types, which would make it even more confusing for the user to choose one. Our reasoning behind the existing property types was that they provide semantics for the software, so it is easier to interpret the data (e.g. showing entities on the timeline or on the map). Therefore we only added the types the application needs to interpret the data correctly. From this standpoint it would not make sense to add more types. There is also a generic property that can contain any kind of information.
The user complained that when creating a new property, the name has to be entered before defining the property type. He found this very unintuitive. The solution is obvious: the steps of defining the type and entering the name have to be switched.
When trying to delete a property, Leon wanted to grab the property and drag it to the delete button (a gesture well-known from applications like Photoshop). In his opinion it is too difficult to have to navigate to the property first. The solution he proposed (dragging and dropping the property onto some kind of trash bin) cannot be implemented, because the properties are displayed in a list and the grabbing gesture is already used to scroll through that list.
While creating the alibi the user wanted to create a property that represents a time span rather than just a point in time. The solution would be to simply add another property type for time spans.
After the user had finished his task, he played around with the application and tested some other functionality. In his opinion the map can be used very intuitively. The only problem he had was that the grabbing gesture is not recognized when the hand is outside the screen space. The problem arises because the user cannot see what the camera sees. A possible solution would be to display the depth or color image of the camera to show the user where the camera sees his hand. This feature is actually already implemented: when the application is run in debug mode, the depth image of the camera is displayed. We thought it helpful while debugging but of no use to the end user, and therefore removed it from the release version.
Leon also tested the speech recognition thoroughly. At first he tried to use intuitive speech commands, which obviously did not work. A proposal for solving this problem has already been presented in Sascha's evaluation. He also did not understand what the "<>" notation means: the user has to fill in the name of the object he wants to perform an action on. This could be solved by displaying the actual options, for example with an animation that cycles through all of them. He also felt that he had to say "CRIME" all the time, because the speech recognition deactivates itself when the user has not interacted with it for 10 seconds. The obvious solution would be to increase this time to 15 seconds or even more.
Just like Sascha, he thought that "CRIME" was not a fitting keyword for activating the speech recognition, because it might come up in conversations. The reasons why this is not a problem were already discussed above.
Leon was also confused by the structuring of the data in the application (cases are made up of entities, and entities are made up of properties). For example, he could not think of a reason why locations are not entities themselves instead of just properties. The reasoning behind this was already explained in the last blog post.
All in all he got used to the application really quickly and solved his task in a short amount of time.
To make it clearer to the user how to interact with the application using gestures and speech commands, we integrated some small hints that are displayed once the application starts. They are only shown once, right after startup, and are dismissed as soon as the user starts to interact with the application. In our opinion this is the perfect compromise between helping novice users and not annoying experienced ones. This change addresses the problem that some users had to figure out on their own how to interact with the application.
At first we wanted to let users dictate text to the application as an option for text input. It turned out that dictation does not work very well, so we had to come up with a better solution. Our approach is a companion app that runs on a tablet. Every time the user encounters a view where he has to input text, the app lets him type it in. This even goes as far as having different input methods for different data types, e.g. when the user has to enter a date, he is presented with a date picker. The app also enables users to upload pictures from the tablet to the application. The app and C.R.I.M.E are synchronized in real time so that the user is not disrupted.
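The type-specific input idea boils down to a simple mapping from a property's data type to an input control, with a plain keyboard as the fallback. The following Python sketch is purely illustrative (the widget names and the mapping itself are assumptions, not the companion app's actual API):

```python
# Hypothetical mapping from property data type to tablet input control.
INPUT_WIDGETS = {
    "date": "date_picker",
    "location": "map_picker",
    "number": "numeric_keypad",
}

def widget_for(property_type: str) -> str:
    """Pick a type-specific input widget; fall back to a plain keyboard."""
    return INPUT_WIDGETS.get(property_type, "text_keyboard")

assert widget_for("date") == "date_picker"
assert widget_for("generic") == "text_keyboard"
```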
All entities that contain location properties are now displayed on the map. The entities are displayed as pins that show some information about the entity. When the user clicks a pin, he is navigated to the entity view. There is also a view that contains all pictures of all entities of the current case. When pressing a picture, the user is likewise navigated to the entity view.
The test users complained that the time before the speech recognition is deactivated is too short. This problem was solved by increasing the time from 10 to 15 seconds.
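The inactivity timeout behaves as sketched below. This is an illustrative Python fragment with hypothetical names (the prototype is C#): the recognition stays active as long as interactions keep arriving, and shuts off once the configured interval has passed without one.

```python
TIMEOUT_SECONDS = 15.0  # raised from 10.0 after the user tests

class SpeechRecognition:
    def __init__(self):
        self.active = False
        self.last_interaction = 0.0

    def activate(self, now: float):
        """Keyword heard: switch on and remember the time."""
        self.active = True
        self.last_interaction = now

    def tick(self, now: float):
        """Deactivate after TIMEOUT_SECONDS without interaction."""
        if self.active and now - self.last_interaction > TIMEOUT_SECONDS:
            self.active = False

sr = SpeechRecognition()
sr.activate(0.0)
sr.tick(12.0)
assert sr.active        # would already have timed out under the old 10 s
sr.tick(16.0)
assert not sr.active    # past the new 15 s limit
```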
Some bugs were fixed as well. For example, when an entity or a property was renamed and the user navigated back to the entity/property view, the name was not updated. These bugs are now fixed.
One of the test users wanted to interact with the application with both hands even when that was not possible in the given context. Therefore only one hand is now displayed. When the user navigates to a view that contains a map, the second hand is activated so that the pinch-to-zoom gesture can be performed on the map.
Finally, when a new entity or property is created, the user now has to select the type first. This change was made due to the problem that one of our test users had.
This video demonstrates the result of our first software prototype iteration.
In this week's assignment we had to develop a functional software prototype. The software prototype is supposed to be a horizontal prototype, which means offering a broad set of functionality instead of perfecting every detail. We decided to focus on creating cases and investigation data and left out other parts, like displaying the data on a timeline or on a map.
We produced a short demonstration video for the prototype, which has been published in a separate blog post.
During the implementation we made some minor changes to the design compared to the paper prototype. First of all, we decided to make a location a property of an investigation entity rather than an entity itself. We found that there would be too many entities if we wanted the locations to be precise, because almost every entity would then have an associated location entity of its own.
At first we wanted each type of entity to have its own set of predefined properties. However, this would have caused a lack of flexibility. So instead, investigators are able to attach an arbitrary number of properties to any type of entity. The entity type is now only important for distinguishing entities.
In the paper prototype the user had to import photos before creating the entities. In our opinion, this is confusing as many people are likely to think that the photos are already entities. Therefore the user now has to attach photos to entities.
In this week's assignment we had to conduct user tests with our paper prototypes. Since our project has a very specific user group that is hard to come by, we chose to conduct our tests with three friends of ours: Robert, Max and Matthias. In this blog post we want to present our observations during the user tests and show the resulting changes that we made to our paper prototype based on these observations.
The first user adapted very quickly to the possibility of using voice commands via the keyword "CRIME". He completely ignored the gesture-based navigation. This confirmed our hypothesis that the prototype lacks a hint pointing out the possibility of navigating with hand gestures. The user also criticized the lack of a back button and/or a voice command for navigating to the previous view. He once navigated to a view he did not intend to go to and had no way of going back. Furthermore, there was no (or at least no obvious) way to select multiple entities using a voice command, which is needed to establish relationships between entities.
After opening the case as described in the scenario, the second user was presented with an empty page. The folder icon right next to the page title suggested that there were hidden options, so the user started out by using hand gestures to try to click the folder icon. He employed the gesture control very intuitively and thus ignored the possibility of using voice commands at first. When he first tried the voice control by saying "CRIME", he went directly to the entity map. Since there is no option for navigating back, he was stuck and could not finish the task of uploading pictures to the case. When he was asked to create a new evidence entity with one of the uploaded pictures, he was irritated by the concept of creating a new entity out of a picture. In his opinion the pictures should rather be tagged, e.g. with "is evidence" or "is suspect". Generally, user 2 favored the gesture control over the voice commands and suggested adding a "grabbing" gesture that could be used to pan (e.g. in the entity map or the timeline).
When uploading the pictures from the camera using the "upload pictures from <camera>" voice command, the third user was irritated that the pictures were uploaded automatically. He thought that he first had to select the photos he intended to upload. After creating the entities from the pictures, he wanted to create a connection between two entities. Unfortunately the "create connection" voice command was missing from the context menu. User 3 also criticized the lack of a back button. On the entity map, the user assumed that he could access more information about an entity by selecting it. This feature was missing because otherwise the prototype would have become too complex. Finally, we observed that navigating through the prototype using voice commands took disproportionately longer than using hand gestures. But we think this is just due to the user's lack of experience.
Picture 1 – We addressed the problem that multiple photos could not be selected using voice commands by introducing a new voice command for selecting multiple photos at once. We also introduced a new option for switching into a selection mode where the user is able to select pictures using a hand gesture.
Picture 2 – When the user holds up his hand and the cursor pops up, a back button fades in, which enables the user to navigate to the previous view.
Picture 3 – A context menu was introduced in the entity map. It pops up when the user selects entities. This addresses the problem that users expected more information to be shown once an entity is selected. In the future we could add an option for displaying all the information about the selected entity.
Picture 4 – In order to address the problem that some testers had a hard time figuring out that they could actually use their hand as an input device, we added a hint. When the user does not interact with the prototype for a couple of seconds, the message "Please say 'CRIME' or use your hand as a cursor" pops up.
Picture 5 – When the user is not using hand gestures to navigate through the prototype, he is reminded that he can say "CRIME back" in order to navigate to the previous view. This change was also made to address the back-navigation problem.
Picture 6 – One of our testers was confused that the pictures were already uploaded after using the voice command "Upload pictures from <camera>". Therefore we added a message which is displayed when the upload is done.
Picture 7 – The first version of the prototype did not have a command for creating a connection between entities, so we added a menu option for that.
Picture 8 – We also added the same menu to the entity, since it is just another way of displaying the entities stored in the case. Now the user does not have to switch between views in order to perform certain tasks.
Picture 9 – In conjunction with the back-navigation problem, we also added a close command to the context menu, so that users can close it when they decide to do something else or have opened it accidentally.
In this week's task we were asked to create a low-fidelity paper prototype, which can then be used to perform usability evaluations. In these evaluations our team acts as the computer and performs all actions that the user wants to carry out. The feedback of the test users can then be used to rethink the design of our software and make changes if necessary. Since we are still at an early stage of development, it is very easy to make major changes to the software design, which we won't be able to incorporate later on. The following photos show the prototype as a whole.
A comprehensive view of the paper prototype.
The case management parts of the paper prototype.
The entity management parts of the paper prototype.
The timeline and the entity map.
The voice control parts of the paper prototype.
Here we want to walk through three common usage scenarios with our paper prototype to show how easy it is to perform user tests with it.
The first scenario shows how the user can enter data into the system using the prototype. The scenario starts out with the system waiting for input. The user then proceeds to upload pictures from the camera and creates a new evidence entity using one of the photos.
Step 1 – The start screen of the application.
Step 2 – The application prompts "Please say 'CRIME'" in order to show the user that it is awaiting input.
Step 3 – When the user says "CRIME", all options that are possible in this context are displayed.
Step 4 – The user chooses to open the case #1.
Step 5 – The user says "CRIME" and the application displays all contextual options.
Step 6 – The user chooses to upload pictures from a camera. The application proceeds to upload the pictures.
Step 7 – The application displays the uploaded pictures.
Step 8 – The user says "CRIME" and the application displays the contextual options.
Step 9 – The user selects the picture #2 and the application displays a context menu with the options.
Step 10 – The user uses a hand gesture to move the cursor over the "Create evidence" option. By holding the hand in position, the option gets invoked.
Step 11 – The application opens a prompt where the user enters a name for the new evidence by speaking it.
The second scenario depicts how connections between entities can be created. The user selects two entities and then proceeds to create and name a new relationship between them.
Step 1 – The user holds his hand up and the cursor appears on the screen.
Step 2 – The user holds the position of his hand to select an item.
Step 3 – The circle around the cursor fills up to signal that it is selecting the item.
Step 4 – The item #2 is selected.
Step 5 – The user proceeds to select the item #3.
Step 6 – The circle around the cursor fills up to signal that the item is being selected.
Step 7 – The item #3 is selected.
Step 8 – The user speaks the voice command for creating a new relationship between the two selected items and names the relationship by speaking the new name.
The third scenario shows how the data stored in the database can be viewed and thus used to derive new clues that might be vital to the investigation. The user displays the entity map and then shows the entities in the timeline.
Step 1 – The user says "CRIME" and the application prompts the contextual options.
Step 2 – The user says "Show entity map" to navigate to the entity map, which shows all entities belonging to the case and their relationships to each other.
Step 3 – The user holds up his hand and the cursor appears on the screen.
Step 4 – The user moves his hand to pan the view of the entity map.
Step 5 – The user proceeds to pan the entity map view.
Step 6 – The user says "CRIME" and the application prompts the contextual options.
Step 7 – The user says "Show events on timeline" and the application navigates to the timeline.
Step 8 – The user holds up his hand and the cursor appears on the screen.
Step 9 – The user moves his hand to pan the view of the timeline.
We conducted two interviews, one with a private detective and one with a police detective, in order to gain more insight into the investigation process of both private and federal investigators. In this blog post we want to publish our findings.
Our first interviewee is a 43-year-old private detective who owns a private investigation firm. She has many years of experience in private investigations and holds a degree in criminalistics. She has a managerial position in the company and acts as team leader in most of the investigations.
We split the interviews up into several different areas of interest. We want to present our findings based on these areas.
Company structure – The private investigation firm has a flat hierarchy in which a certain set of people work as a team on a case. The team is led by a team leader, who is usually the owner herself.
Procedures – The research for a case is done by using specialized databases and the internet. For each step in the investigation process a report is written. The line of reasoning in the reports is supported by documentary evidence such as photos, videos and tracking information (via GPS).
Data protection – All evidence is stored in sealed storage units. Only the team leader is allowed to access the storage units, and each access is logged. The protection of the data is of paramount importance, and therefore the team uses advanced encryption technology, virus protection software, and firewalls, and has no open WiFi.
Closing of a case – A case is considered closed when all requirements of the client are met. The client receives the results, including crucial evidence. All other material is physically destroyed.
Feedback – The will to use optimized software and specialized technical tools is there, but the legislator sets a framework of conditions that have to be met.
Our second interviewee is a 39-year-old police detective who works for the criminal investigation department. He completed his professional training with the police.
Company structure – The department has the typical deep hierarchy of German police departments. For interdisciplinary teams, a coordinator is appointed.
Procedures – In field work forensic scientists are summoned as required. Usually the detectives jot down the witness statements and the crime scene is documented using photos and/or videos. Later on a report is written for the operation. Research is done using specialized databases and older cases are consulted for guidance.
Data protection – Collected evidence is stored in the evidence room, and each detective accessing evidence is logged. Digital data is exchanged using encrypted portable storage devices. For security reasons we were denied further information about the IT infrastructure.
Closing of a case – A final report for a case is commissioned by the lead investigator. Evidence is kept according to federal law (most likely for several years).
Feedback – Such a system could be evaluated by using it in smaller cases. The investigator could envision using smart devices like tablets in the field work.
We were asked to formulate three design principles that we derived from our interview findings:
We created two storyboards that show how people could interact with our software. We chose to depict two different scenarios. The first scenario shows how crime scene investigators could collect evidence from a scene of crime and populate the database with information and photos of the evidence. The second scenario shows how investigators could use the system to evaluate the information they gathered and how it might help them solve their case.
This storyboard is set in a scenario where the investigators first arrive at a crime scene. A person has been murdered and the investigators are collecting the evidence. Later on they document the evidence by inserting it into the database of our software. Once all evidence has been entered into the system, connections between the pieces of evidence are created. Creating these connections is a key feature of the system: it helps the investigators put pieces of information into a greater context by adding semantic information about the relationships between them. There will be different types of information (or entities, as they are called in the software). The types could include locations, evidence, persons, etc.
This storyboard is set in the same scenario, but a little further down the investigation process. All evidence has been gathered and evaluated, and several suspects have been interviewed. Now the investigation team gathers and uses the system to find new clues. This storyboard shows the three main tools the software offers: the entity map, the timeline, and the map. The entity map shows all pieces of information the investigators gathered and puts them into context by showing their relationships to each other. The timeline and the map put the information into a temporal and a spatial context, respectively. The investigators use these tools to find out which of the suspects is most likely the killer. Finally, they are able to narrow it down to one person and solve the case. As seen in the storyboards, users can interact with the system in several different ways, e.g. voice commands, hand gestures, and the usual forms of input devices. We want the software to be agnostic to the form of input, so that users may interact with the system in a way that suits the task and their personal preferences.
We started out by thinking about who would eventually be using the software we are going to develop. Our main target group are investigators and private detectives. Because of this very specific target group, we were not able to conduct our interviews with friends or passersby. Our target group is extremely hard to come by, but we nonetheless tried to get some of its members to give us an interview. We contacted a lot of police departments and private investigation firms and asked them for an interview, but unfortunately the timeframe of two weeks was too short for most of them. None of them consented to a personal interview, but two of them asked us for the questions and agreed to answer them and send us their results. Neither of them returned their results in time. Therefore we could not complete our assignment and were not able to draw conclusions and create storyboards based on our findings.
With our interviews we wanted to reach two goals: firstly, we wanted to find out how our target group does their everyday work and how they would engage with the radically different system that we thought up. Secondly, we wanted to find out more about the psychological aspects of such a system. Therefore we broke the interview into two parts. The first part is designed for our target group; it is semi-structured and consists of a set of questions. The second part is designed for a psychologist and is unstructured; it only consists of basic areas that we focused on. We found a business psychologist who is an expert in finding out how people interact with products and what their problems and fears are.
Since all of our interviewees are Germans we designed our interview questionnaires in German. We split up the questions of the questionnaire for our target group into several different categories.
Greeting. What is your name?
How old are you?
What do you do for a living?
At which company do you work?
What educational path led you to your current job?
How does the company currently work from a technical point of view? Which electronic devices are used (PC, smartphone, digital camera)?
Which people are involved in an investigation?
Is there a structure/hierarchy within the company? Is the work organized in projects, or is there a fixed distribution of tasks?
Are meetings called regularly/as needed?
Can you describe the course of a typical field operation?
Can you describe the tasks involved in office work?
How are new insights derived from the available evidence? Who is involved in this process?
Which data is collected over the course of an investigation?
In what form are data and evidence compiled (digitally, in writing, physical evidence)?
Are there records of chains of evidence (physical evidence) and sources (statements, findings, observations)?
How are data (IT, file folders) and evidence stored?
Which people have access to the collected data?
How are external persons handled (data protection, e.g. forensics)? Do these persons also have access to the data?
Which IT infrastructures are in place? Which security measures are taken?
Closing of a case
When is a case considered closed? Is a final report prepared? If so, by whom?
What happens to the collected data and evidence after a case is closed?
How could such a system be integrated into your everyday work?
Could additional devices (tablets, smartphones) optimize or simplify field work?
For the psychological interview we only defined some general areas of interest that we wanted to address during the interview with the business psychologist. These are the areas that we wanted to discuss:
With these areas of interest we wanted to uncover general flaws in our idea. Usually we would have had to wait for a user to actually engage with an early prototype of our system before finding out about such problems. In order to avoid some general pitfalls, we wanted to talk to someone who has a lot of experience with great products that failed because basic flaws deterred potential customers from actually using them.
Since the psychological interview is the only one we were actually able to conduct, we do not have any findings concerning our user group. Therefore we can only present the findings concerning the psychological aspects of our project. We structured our findings according to the three areas of interest that we defined for the interview.
Data protection caveats
Integration in the everyday work
Since we are lacking the results of two interviews, we were unable to evaluate our findings and therefore decided to wait until all interview results arrive. We also did not want to create our storyboards until we had evaluated all of our interviews, because it would not make any sense to create storyboards and then discard all of them after a thorough evaluation of the interviews.
Today we have finally settled on a toolchain for the development of our project, so I want to give you some insights into the tools and technologies we are planning to employ:
Since some of the team members are not familiar with these tools, we will have a tutorial tomorrow, where some of the tools, and especially C#, will be covered.