26 April 2011

Paper Reading #20: iSlideshow: a Content-Aware Slideshow System

Commentary

See what I have to say about ___'s and ___'s work.

References

Chen, J., Xiao, J., and Gao, Y. (2010). iSlideshow: a content-aware slideshow system. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/

Article Summary

This paper details a slideshow system that groups photos and builds transitions based on their content rather than on arbitrarily defined effects. Content-based grouping allows the system to create one larger image from multiple smaller images that are seamlessly tiled together. The researchers implement a comparison algorithm that ensures a good flow of color and content from the edge of one image to the next, building the entire scene iteratively:

Image courtesy of the above-cited article.
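
To make the edge-comparison idea concrete, here is a minimal sketch (my own illustration, not the authors' algorithm) that scores how well two photos tile side by side by comparing the pixel colors along their shared border; the function names and the sum-of-squared-differences metric are assumptions.

```python
import numpy as np

def edge_mismatch(left_img: np.ndarray, right_img: np.ndarray, strip: int = 4) -> float:
    """Score how poorly two images tile horizontally (lower is better).

    Compares a thin strip of pixels on the right edge of `left_img`
    against a strip on the left edge of `right_img`. Both images are
    H x W x 3 arrays with matching heights.
    """
    left_strip = left_img[:, -strip:, :].astype(float)
    right_strip = right_img[:, :strip, :].astype(float)
    return float(np.mean((left_strip - right_strip) ** 2))

def best_neighbor(current: np.ndarray, candidates: list[np.ndarray]) -> int:
    """Pick the candidate photo whose left edge best continues the current image."""
    scores = [edge_mismatch(current, c) for c in candidates]
    return int(np.argmin(scores))
```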

The researchers also use facial recognition to generate transitions that are relevant to the scene. Once photos are grouped by similar content with respect to facial recognition, transitions take into account the positions of faces between various photos in the group and build transitions based on those locations.

The results of a user study conducted by the researchers show that their user base enjoyed using their system more than others based on aesthetic appeal and fun. Several users mentioned that the slideshows seemed more meaningful with the content-aware transitions.

Discussion

I feel like this is a pretty innovative use of a content-aware system for manipulating photos and such. The fact that it was so well received by the users included in the study leads me to believe that this type of system would be very successful in a mainstream market. I personally would like to try it out! I feel like the current success of arbitrary transition effects and vanilla slideshow presentations is due in large part to the fact that they are just so easy to implement, and any of the more impressive effects, like the ones detailed in this article, are viewed as being too difficult for anyone but a Photoshop power user or someone similar. I'm glad to see the steps that these researchers took with respect to interface design and making the creation of aesthetically pleasing presentations fun and exciting.

Paper Reading #15: TurKit: Human Computation Algorithms on Mechanical Turk

Commentary

See what I have to say about ___'s and ___'s work.

References

Little, G., et al. (2010). TurKit: human computation algorithms on Mechanical Turk. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

In this paper, the researchers describe TurKit, a scripting environment for Amazon's Mechanical Turk (MTurk). MTurk allows people to post Human Intelligence Tasks (HITs) as jobs, for which MTurk Workers (Turkers) may be paid a few cents or more, depending on the difficulty of the task and the amount of time taken to complete it. Some tasks include adding tags or descriptions to images, organizing images based on their content, or voting on the most descriptive or grammatically correct passage of writing in a set. MTurk provides an API that allows users to interface directly with the job creation system. The TurKit environment helps people who post jobs to automate the posting process and to manage iterative jobs as well.

Image courtesy of the above-cited article.

TurKit utilizes the crash-and-rerun programming paradigm, in which a script is run until it crashes or is terminated and then restarts from the beginning, replaying steps it has already completed rather than executing them again. Since posting jobs costs money, the key advantage is that posters can be sure they will never re-post and re-pay for tasks that have already been completed. TurKit also supports the automation of parallelized tasks, a key advantage of MTurk.
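
TurKit itself is a JavaScript toolkit, but the crash-and-rerun idea is easy to sketch in Python: wrap each costly side effect (such as posting a paid HIT) in a helper that records its result in a persistent log, so that when the script is rerun from the top, completed steps are replayed from the log instead of being paid for again. The helper, log file, and post_hit stub below are my own illustration, not TurKit's API.

```python
import json
import os

LOG_FILE = "crash_and_rerun_log.json"   # hypothetical persistent log, not TurKit's format
_log = json.load(open(LOG_FILE)) if os.path.exists(LOG_FILE) else {}

def once(key, costly_action):
    """Run `costly_action` only the first time `key` is seen across reruns of the script."""
    if key not in _log:
        _log[key] = costly_action()   # the expensive (possibly paid) step
        with open(LOG_FILE, "w") as f:
            json.dump(_log, f)        # persist so a rerun can replay the result
    return _log[key]

def post_hit(description):
    # Stand-in for a real Mechanical Turk call; returns a fake HIT id.
    print(f"posting HIT: {description}")
    return "HIT-0001"

# On the first run this posts (and pays for) the HIT; on every rerun it just replays the id.
hit_id = once("post-image-label-hit", lambda: post_hit("Describe this image"))
```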

Discussion

I'm such a huge fan of automation. Parallelism is an added plus, which I'm hoping is something that will start to catch on here in the not-too-distant future. I also like when system maintainers publish an API so that users with that DIY ethic can tinker around and build something that is incredibly useful, if not for the general population then at least for themselves. In short, I like everything I've heard surrounding MTurk and TurKit.

I think that the thing I'm most excited about with respect to this article is the practicality of TurKit. It takes advantage of all of the best parts about MTurk and makes them more accessible. It's a simple design that is very well executed. Nice work :)

Paper Reading #14: A Framework for Robust and Flexible Handling of Inputs with Uncertainty

Commentary

See what I have to say about ___'s and ___'s work.

References

Schwarz, J., et al. (2010). A framework for robust and flexible handling of inputs with uncertainty. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

This paper details a system that handles uncertain or ambiguous input. The researchers have devised a system that extends the conventional input system. Whereas in a conventional system, a user action either causes or does not cause a system action, in the uncertain system, all possible actions are taken into account and the most probable action is chosen. Actions do not cause final, irreversible changes to the system until temporary feedback is given to ascertain the intended input or the inferred action crosses a certain probability threshold. In the event that an action does not cross the "most probable" threshold, the user is given feedback of certain types when performing an action and may alter the action to generate the desired response. One possible temporary feedback type is detailed in the images below. As the user tries to select one slider, both are accidentally activated. A conventional system would just select one slider regardless of user intention, or do nothing at all. The uncertain system selects both sliders, gives temporary feedback on the possible state, and then allows the user to correct their input before a finalized action is taken.

Image courtesy of the above-cited article.
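
A minimal sketch of the dispatch idea, assuming a set of candidate targets each scored with a probability for the current touch: an action is only committed once one candidate clears a confidence threshold; otherwise every plausible candidate receives temporary feedback. The class, threshold, and names below are my own illustration, not the authors' framework.

```python
from dataclasses import dataclass

FINALIZE_THRESHOLD = 0.8   # assumed confidence needed to commit an action

@dataclass
class Candidate:
    name: str           # e.g. "volume_slider"
    probability: float  # likelihood that this target was the intended one

def dispatch(candidates: list[Candidate]) -> str:
    """Commit an action only when one interpretation is clearly the most probable."""
    best = max(candidates, key=lambda c: c.probability)
    if best.probability >= FINALIZE_THRESHOLD:
        return f"commit: {best.name}"
    # Ambiguous input: give temporary feedback on every plausible target instead
    plausible = [c.name for c in candidates if c.probability > 0.2]
    return "tentative feedback on: " + ", ".join(plausible)

print(dispatch([Candidate("volume_slider", 0.55), Candidate("brightness_slider", 0.45)]))
```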

This system can handle uncertainty with both graphical and text input, including activation of multiple tiny buttons; inexactly placed input for scrolling, resizing windows, and dragging and dropping icons; multiple interpretations for spoken input to text translation; and greater ease of use for people with motor impairments.

Discussion

As with most projects that are at least latently philanthropic in nature, I really enjoyed this paper. For starters, gently correcting for erroneous input seems like a great idea, and it seems that this system does this without generating a large amount of overhead. We are already used to word processors automatically correcting our commonly misspelled words, or Google showing us results for what they think we really meant to search. Second, their results show a high rate of success in increasing ease-of-use for motor-impaired individuals, which is awesome. Admittedly, automatic "corrections" or "suggestions" can sometimes be pretty annoying; from what I've read, this system seems to strike a good balance.

Paper Reading #13: Gestalt: Integrated Support for Implementation and Analysis in Machine Learning

Commentary

See what I have to say about ___'s and ___'s work.

References

Patel, K., et al. (2010). Gestalt: integrated support for implementation and analysis in machine learning. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

This group of researchers presents Gestalt, a system that provides integrated support for implementing and analyzing machine learning applications. They demonstrate it on a gesture recognition task, using the system to track down bugs in source code and cases where the recognizer succeeds or fails to recognize a gesture.

Discussion

In all honesty, this paper was completely incomprehensible to me.

Image courtesy of Queen Michelle

Paper Reading #12: Pen + Touch = New Tools

Commentary

See what I have to say about ___'s and ___'s work.

References

Hinckley, K., et al. (2010). Pen + touch = new tools. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

In this paper, the researchers examine the capabilities of a multimodal interface with touch- and pen-based interaction. They believe that interactions can be separated into unimodal and multimodal categories based both on input device and intended action.

In the first phase of their project, the researchers conducted an initial study in which subjects used pen, paper, and various tools to organize objects in a notebook. They took note of common actions that all subjects performed and of the affordances that a fully manipulable environment granted them. Some examples that helped the researchers categorize different unimodal and multimodal commands include the subject tucking the pen in the fingers to manipulate objects in the environment, using only fingers to hold down or reposition objects, or using objects as part of the environment with one hand while drawing with the other.

Using their observations from the first phase, they designed an interface on the Microsoft Surface system that incorporates as many of the natural affordances from the first phase as possible: a multimodal pen-and-touch interface. Touch and pen input can each be used unimodally, with their own affordances; when the input is combined, i.e. multimodal, the context of the interaction changes and a new set of affordances becomes available. They characterize the division of labor as such: "...the pen writes, and touch manipulates, period." Some of the combined, i.e. multimodal, interactions included: holding objects together and tapping with the pen to "staple" them; holding an object steady and using the pen as an X-acto knife; holding an object steady and creating a "carbon copy" by dragging a copy off of it with the pen; and holding an object steady and using it as a straightedge along which to draw with the pen.

Using an object in the scene as a straightedge.
Image courtesy of the above-cited paper.
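
The "pen writes, touch manipulates" division of labor boils down to a mode dispatch on which inputs are active at the same time. Here is a rough sketch of that idea (my own simplification, with placeholder gesture names, not the authors' implementation).

```python
def interpret(pen_down: bool, touching_object: bool, pen_gesture: str = "") -> str:
    """Rough dispatch for the pen-writes / touch-manipulates division of labor.

    Unimodal pen input writes, unimodal touch manipulates, and the combination
    switches to the contextual tools described in the paper (staple, cut,
    carbon copy, straightedge). Gesture names are placeholders for whatever
    recognizer sits in front of this function.
    """
    if pen_down and touching_object:
        combined = {
            "tap": "staple the held objects together",
            "stroke_across": "cut with the X-acto knife",
            "drag_off": "peel off a carbon copy",
            "stroke_along_edge": "draw along the straightedge",
        }
        return combined.get(pen_gesture, "ink on top of the held object")
    if pen_down:
        return "write / ink"
    if touching_object:
        return "move, rotate, or pile objects"
    return "idle"

print(interpret(pen_down=True, touching_object=True, pen_gesture="tap"))
```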

Discussion

This interface is even more intuitive than the last one I reviewed! And I love it! I appreciate the work that the researchers put into observing natural interactions with the type of environment that they wanted to create. This seems to be the smartest way to make an interface parallel interaction in the real world, and indeed, to allow an interface to achieve its maximum usability potential. The way they separated out the roles of touch and pen was ingenious. All in all, this is one of the best designs for a new interface I have seen throughout these papers. I'm almost as excited about this interface as I am about the Minority Report-style interface :)

Paper Reading #11: Hands-On Math: A page-based multi-touch and pen desktop for technical work and problem solving

Commentary

See what I have to say about ___'s and ___'s work.

References

Zeleznik, R., et al. (2010). Hands-on math: a page-based multi-touch and pen desktop for technical work and problem solving. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

The researchers for this paper present a fusion of two technologies that seem to complement each other well: pencil-and-paper mathematical calculations and a Computer Algebra System (CAS). They argue that pencil-and-paper provides a greater degree of freedom with respect to spatial interaction and is more intuitive because of the physical method of input. They do not discount the usefulness of a CAS in solving complex equations much more quickly and efficiently than could be done by hand, however. They have created a system, built on the Microsoft Surface, that combines these two interaction styles with multitouch capabilities and multimodal input, i.e. light pen and fingers.

The researchers detail the capabilities of their system in depth. The system affords the user the ability to create and delete pages, pan across the virtual tabletop, access context-specific menus, and influence the context of gestures depending on the combination of pen and touch inputs utilized. The researchers also used preexisting Software Development Kits (SDKs) to perform the mathematical operations. Overall, the results from a user study they conducted support their hypothesis, but offer many suggestions for future expansion as well. Specifically, users detailed what functionality they would have liked to see added to pages, e.g. growing or shrinking pages, or having data on one page accessible to all pages; and issues regarding the necessity of having both hands free for gesture input.

Discussion

I really like this design concept. I feel the exact same way as the researchers do about the affordances of pencil-and-paper and CAS. Personally, I can't wait until this happens:

Image courtesy of Niobium Labs

There is something that is just so inherently intuitive about touch interfaces: every interaction between ourselves and our physical world is through some form of manual manipulation. I'm glad that these types of interfaces are becoming more mainstream. That is all.

Paper Reading #10: PhoneTouch: A Technique for Direct Phone Interaction on Surfaces

Commentary

See what I have to say about ___'s and ___'s work.

References

Schmidt, D., et al. (2010). PhoneTouch: a technique for direct phone interaction on surfaces. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

This group of researchers presented a novel interaction scheme utilizing a mobile phone as a stylus in conjunction with a finger, with interaction-dependent responses based on whether a phone or a finger was used. The interface was a tabletop touchscreen that connected to one or several phones via Bluetooth and discriminated between phone and finger touches via impact size and accelerometer data sent by a device attached to the phone. The image below depicts this setup:

Image courtesy of the above-cited article.
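
A rough sketch of the discrimination step, assuming the tabletop reports a contact size and each phone streams timestamped accelerometer spikes over Bluetooth: a contact is attributed to a phone when a hard "bump" arrives from that phone within a short window of the touch; otherwise a sufficiently large contact is treated as a finger. The thresholds and names are my own assumptions, not the authors' implementation.

```python
SIZE_THRESHOLD_MM = 6.0   # assumed: phone corners make smaller, harder contacts than finger pads
TIME_WINDOW_S = 0.05      # assumed: max gap between the surface contact and the phone's "bump"

def classify_contact(contact_time: float, contact_size_mm: float,
                     phone_bumps: dict[str, list[float]]) -> str:
    """Attribute a surface contact to a phone (by id) or to a finger.

    `phone_bumps` maps phone ids to timestamps of accelerometer spikes
    received from the phones over Bluetooth.
    """
    for phone_id, bumps in phone_bumps.items():
        if any(abs(contact_time - t) <= TIME_WINDOW_S for t in bumps):
            return f"phone:{phone_id}"
    if contact_size_mm >= SIZE_THRESHOLD_MM:
        return "finger"
    return "unknown"

print(classify_contact(10.02, 3.5, {"alice-phone": [10.01]}))   # -> phone:alice-phone
```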

The idea behind this design is that interaction with the interface changes context when a phone is used instead of a finger. For example, a user can open up their photo album, touch the interface with their phone, and have the photos displayed on the interface. They can then move images around, examine them, and separate them into groups, and finally transfer those images to another phone by tapping on the group with that phone. In general, the results coincide with what the researchers were hoping to achieve.

Discussion

This is an interesting concept, and I'd like to see it implemented in the real world, but I believe that will only happen once a device like this interface becomes more multipurpose, not to mention mainstream. There are already table-type touch interfaces, but they are not very widespread. The benefit of a personal computer is that it is ubiquitous and multipurpose, i.e. one can do more than just interact with photos and files on a phone. The cost of this type of system would make it prohibitively expensive for anyone but a gadget junkie until the above issues are addressed. Really neat concept though :)

Paper Reading #8: Communicating Software Agreement Content Using Narrative Pictograms

Commentary

See what I have to say about ___'s and ___'s work.

References

Kay, M., and Terry, M. (2010). Communicating software agreement content using narrative pictograms. Proceedings of the ACM Conference on Human Factors in Computing Systems (pp. 2705-2714). Atlanta: http://www.sigchi.org/chi2010/.

Article Summary

These researchers set out to address several common issues with software agreements, such as End User License Agreements, and their reception by a user base. A few key issues that they cite are the amount of text and its relative reading level, the fact that some agreements are localized for only a few regions, and the success of pictorial communication of ideas contrasted with its absence from software agreements.

The researchers set up a system of pictograms depicting how content and usage information would be collected from an image editing program. They utilized four sets of diagrams, each depicting a different type of data that would be collected and how it would be used. Below is one such diagram with the explanatory text removed, as a subject of the study might have seen and had to interpret it:

Image courtesy of the above-cited article.

In general, the results were promising: their initial test group had some difficulty identifying the concepts outlined by the pictograms, but with some slight modifications, the rest of the subjects did sufficiently well. Having basic text explaining the images did much to increase understanding of the images, and was much easier to read than the software agreement itself.

Discussion

I enjoyed the concept of pictorial descriptions of license agreements, but I doubt that the licenses will ever go away entirely. They are obviously necessary for legal reasons, and in some instances a user will trudge through an agreement out of boredom or curiosity or out of a genuine desire to know his limits and freedoms. A main issue that the researchers will have to overcome is comprehension of the pictures. The licenses themselves, while sometimes inherently incomprehensible due to the high level of "legalese," still provide an accurate and complete description of the responsibilities of both the user and the company providing the software. The pictures, while only meant to "augment the text," as the researchers put it, will still have to provide an accurate description of the text. I can only imagine the problems that would arise, what with this country's affinity for frivolous lawsuits and the like:

Image courtesy of Natalie Dee.

Paper Reading #7: Hard-To-Use Interfaces Considered Beneficial (Some of the Time)

Commentary

See what I have to say about ___'s and ___'s work.

References

Riche, Y., et al. (2010). Hard-to-use interfaces considered beneficial (some of the time). Proceedings of the ACM Conference on Human Factors in Computing Systems (pp. 2705-2714). Atlanta: http://www.sigchi.org/chi2010/.

Article Summary

This group of researchers took an entirely novel approach to evaluating their systems for usability. They examine how something that would initially have been considered a bug or a barrier to usability from the designers' perspective can instead be treated as a feature.

Image courtesy of The Geek Whisperer.

In their first study, the researchers placed a group of computer scientists in a collaborative design environment; the group discovered a bug that linked each person's interaction with the interface to everyone else's actions. The solution that the group implemented themselves was to increase social interaction outside of the system, e.g. waiting for someone else to finish a task before beginning their own, or asking the group to pause so that they could complete their task. The result of this collaboration was that the group that had to deal with the "bug" reported higher satisfaction with the system, because coordinating with each other allowed them to make fewer errors related to each other's actions.

In the second study, the researchers questioned a group of older individuals to gauge their satisfaction with new interfaces that are designed to make interacting with technology easier. The results of this examination, however, showed that the group did not value correspondence generated with the help of technology as much as they valued the same correspondence generated by hand. For this group, apparent effort equated to higher value. The researchers' proposed solution to this problem was to make an interface explicitly harder to use in order to increase the perceived value of messages generated with it.

Discussion

I was really excited to see the evaluation of this specific issue, a "bug" becoming a "feature" instead of being immediately fixed. It obviously worked out well in these cases, and speaks volumes about our ability as a species to overcome adversity. I know that sounds really cheesy, but it's true. Our society is all about instant gratification: I want this and I want it now! Things like this force us to take a step back and actually embrace our humanity rather than push it to the wayside. For some definition of the word "humanity," that is... Now this isn't to say that I don't appreciate the fast response of Google or the wide knowledge base that something like Netflix or Pandora utilizes to generate content and so on. I like those services for what they are: tools. Knowing that they exist, would I be disappointed if I couldn't use them anymore? Yeah, I would. Would I mourn their loss like the loss of a friend? Absolutely not. I don't know if there's anyone out there who would freely admit that they would, but take a look at how "constantly connected" we are to our technology. If you've ever done it, you know how nice it is to unplug every once in a while.

Image courtesy of Trip Advisor.

25 April 2011

Paper Reading #24: A Natural Language Interface of Thorough Coverage by Concordance with Knowledge Bases

Commentary

See what I have to say about ___'s and ___'s work.

References

Han, Y., et al. (2010). A natural language interface of thorough coverage by concordance with knowledge bases. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/

Article Summary

In this paper, Han et al. discuss a novel approach to solving a common problem with natural language interface (NLI) systems. As illustrated below, of the total set of expressions a user might possibly input, there is a partial disparity between those expressions that a given system can interpret and those which a given knowledge base can answer.

Image courtesy of the above-cited article.

Whereas most NLI systems try to make up the difference by expanding the number of expressions interpretable by the system, this team proposed to generate the interpretable expressions from the expressions answerable by the knowledge base instead. They identify several levels of classification based on a graph representation of the knowledge base, and generate all queries for a given level that are answerable. They then cast each user expression as one of the answerable expressions via a similarity measure.
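
As a toy illustration of the matching step, here is a sketch that casts a user expression onto the closest answerable query using a bag-of-words cosine similarity; the similarity measure is a stand-in of my own, not necessarily the one used in the paper.

```python
from collections import Counter
import math

def bag(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cast_to_answerable(user_expression: str, answerable_queries: list[str]) -> str:
    """Map a free-form user expression onto the closest query the knowledge base can answer."""
    return max(answerable_queries, key=lambda q: cosine(bag(user_expression), bag(q)))

queries = ["which movies did a given director direct",
           "which actors appeared in a given movie"]
print(cast_to_answerable("show me the movies by this director", queries))
```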

Discussion

I don't know how any NLI systems work, but honestly, this approach, while (purportedly) novel, does not seem like a great feat of science and research. Maybe it is, I don't know: like I said, I know nothing about NLI systems. But I feel like I might possibly have stumbled upon this concept if you gave me the image above and described the problem to me. I'm just saying. That being said, was 2010 really the first time that anyone thought to cast the problem like this, or was that just the first time anyone thought to publish a paper on the topic? It just really seems like this specific problem is one that computer science as an all-encompassing entity would have solved a long time ago. Please, rebuke me, correct me, enlighten me if you can.

19 April 2011

Paper Reading #25: Using Language Complexity to Measure Cognitive Load for Adaptive Interaction Design

Commentary

See what I have to say about ___'s and ___'s work.

References

Khawaja, M. A., Chen, F., and Marcus, N. (2010). Using language complexity to measure cognitive load for adaptive interaction design. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

The researchers addressed the problem of adaptive interfaces in this paper, specifically adaptive interaction design. They proposed a method of controlling the level of adaptation of an interface by monitoring speech patterns. Different patterns were mapped to different levels of cognitive load, or how much of the brain's processing power is being used to compute a task. Current speech recognition capabilities make real-time implementation infeasible; instead, the researchers performed quantitative analysis on transcriptions of several training exercises at an Australian bushfire management facility. They measured semantic difficulty, reflected in word choice, and syntactic complexity, reflected in sentence length. Their hypothesis proved correct save for one prediction. Using statistical analysis, they were able to apply different language complexity measures to accurately map sentences used in different situations to the appropriate level of cognitive load for that situation.

For lack of a more salient picture, here is a table of their results.

As speech recognition capabilities become more advanced, they hope to test their implementation in a real-time situation.
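
As an illustration of the two measure families, here is a minimal sketch that computes average sentence length (a syntactic proxy) and a crude vocabulary-based score (standing in for semantic difficulty) over a transcript; the specific formulas are my own placeholders, not the measures used in the paper.

```python
import re

def language_complexity(transcript: str) -> dict[str, float]:
    """Very rough proxies for the two measure families discussed in the paper."""
    sentences = [s for s in re.split(r"[.!?]+", transcript) if s.strip()]
    words = [w.lower() for s in sentences for w in s.split()]
    return {
        "avg_sentence_length": len(words) / len(sentences),   # syntactic complexity proxy
        "type_token_ratio": len(set(words)) / len(words),     # vocabulary/semantic proxy
    }

print(language_complexity("Hold the line. Crews are moving to the eastern flank now."))
```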

Discussion

I really enjoyed this paper. Sometimes, I think I should have gone into English, specifically linguistics. Don't get me wrong: I like to chill out a little bit in this here discussion section, but I feel that crafting a literary piece or a technical document is akin to creating a work of art. Languages are so easy to use in a practical sense, but they can be wonderfully complex if they are truly understood. These researchers understand language (or at least how to analyze it). What's more is that they understand the importance of language and communication. Once speech recognition is up to par with the vision that these researchers have laid out, our interactions with our machines will be as seamless as our interactions with each other. Technology is a tool, and nothing more, but capable, efficient tools only make sense; why would we not strive to reach the limits of our potential to create and design?

07 April 2011

Paper Reading #9: Imaginary Interfaces: Spatial Interaction with Empty Hands and without Visual Feedback

Commentary

See what I have to say about ___'s and ___'s work.

References

Gustafson, S., Bierwirth, D., and Baudisch, P. (2010). Imaginary interfaces: spatial interaction with empty hands and without visual feedback. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

This article describes a system called Imaginary Interfaces, which focuses on screen-less wearable devices; it presents the setup and results of a user study conducted by the researchers and details a mock-up of such a device. The idea behind this interface is to have the user create a "screen context," in which they operate as if they were using a device providing spatial feedback:

Image courtesy of the above referenced article.

All of the spatial information is contained in the user's mind, however; all interaction is done relative to the user's frame of reference. Three user studies were conducted: the user drew common shapes and letters multiple times, and the variation between consecutive drawings was analyzed; the user drew an image and then was required to identify a specific point on that image, both with and without changing their frame of reference, i.e. standing still or turning, respectively; and the user identified a point in a coordinate system whose units were lengths of digits, i.e. (2,1) referred to two thumbs right and one finger up. The results were very interesting, casting this type of device as something that could be practically realized. Finally, the researchers proposed a possible design for the device, which works by illuminating the user's hands with infrared light, applying a luminance threshold, and discerning the structures that comprise the imaginary interface:

Image courtesy of the above referenced article.

Discussion

First of all, to say nothing of the subject of the paper itself, this was the most straightforward research I have yet encountered. The concept of the device was clearly laid out, motivation and previous work was well-presented, and the results of the user studies were very accessible. That being said, I want one. I am fascinated by the concept of wearable computing, and this just takes it one step further. I honestly believe that so many awkward interactions could be alleviated through the use of something like this, e.g. you're on the phone and you just cannot express some simple concept or idea, and if only you were able to just draw a simple diagram, or had a little whiteboard... I really like this concept. I think they did a great job at coming up with tangible results from creating a frame of reference to come back to without having to be in the same visual context, i.e. you can pull up your frame of reference anywhere you want by just popping up your thumb and forefinger. Great work :)

05 April 2011

Paper Reading #19: From Documents to Tasks: Deriving User Tasks from Document Usage Patterns

Commentary

See what I have to say about Shena's and Derek's work.

References

Brdiczka, O. (2010). From documents to tasks: deriving user tasks from document usage patterns. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

In this paper, the researcher presents a novel approach to implementing a task management support system. A system such as this takes care of monitoring the tasks of knowledge workers and clustering common tasks to increase productivity. The researcher argues that switching between multiple tasks is "expensive because each task requires some recovery time as well as the reconstitution of task context." This system is novel because it does not group documents based on title or content, both of which introduce privacy concerns. Instead, documents are assigned a unique identifier and are filtered by their dwell time, or how long they have focus and are actively being accessed. Documents are then grouped by similarity via a spectral clustering algorithm. The proposed system had the additional benefit of not needing any user input whatsoever, a feature that is necessary for most other systems of its type. The system was evaluated over a period of a month, with non-contiguous observation days. Normal knowledge workers were observed performing some commonly recurring tasks. The proposed system showed a high level of effectiveness at task grouping in comparison with similar systems.
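
A minimal sketch of the clustering step, assuming we already have per-time-window dwell times for each (anonymized) document: documents become rows of a dwell-time matrix, pairwise similarity captures how often two documents are active together, and spectral clustering groups them into candidate tasks. The feature choice and the scikit-learn call are my own assumptions, not the paper's exact pipeline.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import cosine_similarity

# Rows: documents (known only by an opaque id); columns: time windows.
# Entries: seconds of dwell time the document had focus in that window.
dwell = np.array([
    [120,  90,   0,   0],   # doc 1
    [100, 110,   5,   0],   # doc 2 (co-used with doc 1 -> likely the same task)
    [  0,   0,  80,  95],   # doc 3
    [  0,   5,  70, 100],   # doc 4 (co-used with doc 3)
])

similarity = cosine_similarity(dwell)   # how strongly two documents are active together
labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(similarity)
print(labels)   # e.g. [0 0 1 1]: two candidate task clusters
```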

Discussion

This paper seemed incredibly abstract to me. I'm sure that at least some of the industry-specific terms like "knowledge worker" and "recovery time" and "reconstitution of task context" must have meaning to someone out there, right? I just don't understand what the point of "task clustering" is supposed to be; what does it do? How does it increase productivity? To a lowly, uninitiated "luser" like me (NO I WILL NOT APOLOGIZE FOR THAT DON NORMAN), it feels like we're "promoting synergy" or some other dumb catch phrase:

Image courtesy of The Lonely Island and Oh! Ryan Kelley

I'm not knocking the paper, the author, or his work; he seems to have done a pretty stellar job. And I appreciate the novelty of his approach, not to mention the apparent success of his method. I just don't get it. At all.

02 April 2011

Paper Reading #16: Mixture Model based Label Association Techniques for Web Accessibility

Commentary

See what I have to say about Wesley's and Miguel's work.

References

Islam, M. A., Borodin, Y., and Ramakrishnan, I. V. (2010). Mixture model based label association techniques for web accessibility. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

Islam et al. present a system they have created that extends the functionality of an assistive technology called screen reading. This technology utilizes text-to-speech to allow blind users to navigate websites by reading the content on webpages and descriptions of elements to them. A major impediment to the proper functioning of screen readers is the omission of labels for page elements and alternative text for images. Without proper labels, form elements can be misrepresented or not denoted at all. Without alternative text for images, transaction functionality for most websites is completely lost, as transactional dialogs are usually controlled through images, e.g. an "Add to cart" or "Checkout now" button:

Taken from the above referenced paper.

Even properly labeled items are sometimes not handled properly by the screen reader by virtue of the ambiguity of the HTML Document Object Model (DOM), e.g. labels for elements and the elements themselves being contained in different HTML table rows:

Taken from the above referenced paper.

The authors implemented a finite mixture model (FMM) to create contexts to which HTML elements and possible labels belong. Using these contexts, the FMM can also create labels for unlabeled elements with some accuracy and more correctly interpret labels for ambiguous objects. In evaluating their system, the authors observed a 76% success rate of correctly applying labels to their elements without prior training of their FMM and a 95% success rate with prior training when all elements were explicitly labeled. On a testing set without any labels, their FMM achieved an 81% success rate. In a user study with two blind users who were proficient with screen reading technology, both agreed that the FMM made interacting with webpages easier for them.
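
The mixture model itself is beyond a blog snippet, but the core idea of scoring candidate labels for a form element by context can be sketched with a deliberately simplified heuristic that uses only geometric proximity and left/above position; this is my own stand-in, not the authors' FMM.

```python
from dataclasses import dataclass

@dataclass
class Box:
    text: str
    x: float   # left edge of the element on the rendered page
    y: float   # top edge

def label_score(label: Box, field: Box) -> float:
    """Higher is better: prefer label text that sits just left of or just above the field."""
    dx, dy = field.x - label.x, field.y - label.y
    distance = (dx ** 2 + dy ** 2) ** 0.5
    aligned = (dx >= 0 and abs(dy) < 10) or (dy >= 0 and abs(dx) < 10)
    return (1.0 if aligned else 0.0) - 0.01 * distance

def associate(labels: list[Box], field: Box) -> str:
    """Pick the most plausible label text for an unlabeled form field."""
    return max(labels, key=lambda l: label_score(l, field)).text

candidates = [Box("Shipping address", 10, 40), Box("Quantity", 10, 120)]
print(associate(candidates, Box("", 150, 122)))   # -> Quantity
```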

Discussion

Wow. I personally feel that this is some of the most incredible research I've discovered yet. The authors have created a system that solves a very practical problem, and solves it well. The idea of creating contexts from which to infer labels was ingenious, as was the systematic approach to evaluating documents geometrically. I have to commend them: the system they created seems to be entirely robust. In addition, while this work may have implications outside of catering to users who need assistive technology, I feel that there was at least some measure of philanthropic drive behind the project, purposefully or not. I approve of this work.

Paper Reading #17: Mobia Modeler: Easing the Creation Process of Mobile Applications for Non-Technical Users

Commentary

See what I have to say about Joshua's and Shena's work.

References

Balagtas-Fernandez, F., Hussmann, H., and Tafelmayer, M. (2010). Mobia Modeler: easing the creation process of mobile applications for non-technical users. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

In this article, the researchers present a system composed of an abstraction model and its respective interpreter, called the Mobia Modeler. The aim of the system is to allow users without technical experience in mobile platform programming to create practical applications with the modeling system. A usage scenario presented by the researchers involved a doctor creating an application that would monitor a patient's vital signs and alert the doctor or emergency officials in the event of abnormal or dangerous observations. The system uses two subsets of the modeling language: a Platform Independent Model (PIM) and a Platform Specific Model (PSM). Users create applications by linking components together via the graphical user interface, and the system's processor then translates the graphical model to platform-specific code based on the relationships between the components defined by the user. Put another way, the user writes an application in the PIM, and the processor translates it to the PSM. Participants in the evaluation included individuals from different fields and from different technical backgrounds, i.e. those with programming experience and those without. The evaluation revealed that the programmers rated the system higher than the non-programmers.
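
A toy sketch of the platform-independent-to-platform-specific idea: the application is described as data (components and the links between them), and a per-platform generator turns that description into code. Everything here, including the component names and the generated snippet, is my own illustration rather than the Mobia Modeler's actual model.

```python
# Platform-independent model (PIM): components and the links between them.
pim = {
    "components": [
        {"id": "hr_sensor", "type": "SensorComponent", "reads": "heart_rate"},
        {"id": "alert",     "type": "AlertComponent",  "notify": "doctor@example.org"},
    ],
    "links": [("hr_sensor", "alert")],   # the sensor's output feeds the alert component
}

def generate_platform_stub(model: dict) -> str:
    """Translate the PIM into a (fake) platform-specific code stub."""
    lines = ["// generated platform-specific code (illustrative only)"]
    for comp in model["components"]:
        lines.append(f'register{comp["type"]}("{comp["id"]}");')
    for src, dst in model["links"]:
        lines.append(f'connect("{src}", "{dst}");')
    return "\n".join(lines)

print(generate_platform_stub(pim))
```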

Discussion

As the authors stated, this isn't the first time something like this has been tried. For example, National Instruments has the LabVIEW platform, which admittedly provides much more functionality and control than the Mobia Modeler and is targeted at a much more technical crowd. Additionally, the Java programming language comes to mind in reference to platform-independent and -specific code.

A basic code example in LabVIEW. Image courtesy of San Diego State University

That being said, I think the authors did a fantastic job of bringing these concepts together in a way that is accessible to users of all technical backgrounds. Indeed, it makes the most practical sense (to me) to implement the system in this way. Non-technical users will quickly come to learn how to control interactions between objects graphically, and the platform-independent and -specific paradigm makes this application marketable across a wide share of current mobile devices. It comes as no surprise to me that the programmers felt more at ease using this system: they already think in terms of the system. With a little more practice, however, I believe that the non-technical users would be up to speed with their more technical counterparts in no time.

Paper Reading #18: Evaluating the Design of Inclusive Interfaces by Simulation

Commentary

See what I have to say about Steven's and Evin's work.

References

Biswas, P., and Robinson, P. (2010). Evaluating the design of inclusive interfaces by simulation. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

In this paper, Biswas and Robinson discuss their development of a simulator that evaluates usage scenarios for different assistive interfaces. Assistive interfaces refer to those interfaces designed to assist users who are physically impaired.

The Samsung Jitterbug, an example (sort of) of an assistive interface. Image courtesy of My Vision Aid, Inc.

Their study consisted of comparing the simulator's predictions of how long different tasks would take for users with various impairments against actual measured times for different users. The researchers identify text-search tasks and icon-search tasks, but specifically focus on icon-search tasks. The two subtasks tested were searching for an icon, and pointing and clicking on an icon. They varied the spacing between icons and font size for the icon captions. The participants in their test consisted of able-bodied individuals, individuals with vision impairments, and individuals with motor impairments. In computing the error in the simulator's prediction of how long the task would take, they found that the simulator accurately predicted task times with statistical significance.
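
For a flavor of what such a simulator computes, here is a toy predictor that combines a visual search term with a Fitts'-law-style pointing term and scales each by an impairment factor; the constants and impairment multipliers are invented for illustration, and the paper's models are far more detailed.

```python
import math

def predicted_task_time(num_icons: int, target_distance_px: float, icon_size_px: float,
                        vision_factor: float = 1.0, motor_factor: float = 1.0) -> float:
    """Toy icon-search plus point-and-click time prediction, in seconds."""
    search_time = 0.3 * num_icons * vision_factor                 # linear visual search term
    fitts_id = math.log2(target_distance_px / icon_size_px + 1)   # index of difficulty
    pointing_time = (0.1 + 0.15 * fitts_id) * motor_factor        # Fitts'-law-style pointing term
    return search_time + pointing_time

# Same layout, able-bodied baseline vs. a simulated motor impairment:
print(predicted_task_time(12, 400, 48))                           # baseline
print(predicted_task_time(12, 400, 48, motor_factor=2.5))         # slower pointing
```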

Discussion

I am sure that there have been other systems similar to this that have been developed, but this is the first I have heard about such a system. This seems to be a natural extension of unit testing, where in this case the unit is the usability rather than the functionality of the interface itself. The obvious gain here is that, given the efficacy of this model in predicting performance, one need not waste the time and money to perform an actual user study on an interface: just input the impairment parameters and run the interface through the simulator. Of course, the interface would have to be codified as per the simulator's capabilities, i.e. one would need to know font size and distance between icons, but this seems like something that is a pretty important part of inclusive interface design anyway. What I'm saying is, it doesn't seem like it would be a big stretch to be able to set up an interface to be tested by this simulator. Here's an idea for future work: extend the simulator from processing an interface based on a flat screen to processing three-dimensional data. Simulate movement throughout an environment, say, a home? I'm a fan of optimization.

24 February 2011

Paper Reading #6: Blowtooth: Pervasive Gaming in Unique and Challenging Environments

Commentary

See what I have to say about Ryan's and Chris's work.

References

Linehan, C., et al. (2010). Blowtooth: pervasive gaming in unique and challenging environments. Proceedings of the ACM Conference on Human Factors in Computing Systems (pp. 2695-2704). Atlanta: http://www.sigchi.org/chi2010/.

Article Summary

Linehan and his associates studied the concept of pervasive gaming, in which one utilizes the real world to fuel interactions in a virtual environment. Specifically, they tried to map the success of pervasive games and their applicability to various environments. They created a game called Blowtooth, a virtual drug smuggling game in which players dump virtual contraband onto unsuspecting "mules" in international airports to get their stash through security, only to meet up with the mule once past security to retrieve the stash. An airport was chosen as the context of the game because it is a readily accessible real environment to which the concept of the game relates well. For this reason, Blowtooth is also a critical game, one which encourages the player to think critically about the ethical and societal concerns of the game and its implications.

The game works by searching for the unique IDs of discoverable Bluetooth-enabled devices, storing them on the player's device, and then enforcing a wait period before the player is allowed to "retrieve their stash," i.e. rediscover previously discovered devices. Non-players have no other role in the game than to possess a discoverable Bluetooth-enabled device. The players were later questioned on how appropriate they felt the environment was for the game, their level of satisfaction, increased levels of awareness of security and other passengers, and anxiety. The notion that "it's just a game" contributed to the low levels of anxiety and concern about security, but the game did succeed as a critical game, as reported by the players.
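
The core game loop is simple to sketch: record the IDs of discoverable devices seen while "planting," then count a device as retrieved only if it is seen again after the mandatory wait. The scanning function below is a stub standing in for the phone's Bluetooth discovery API; the wait period and names are my own assumptions.

```python
import time

WAIT_PERIOD_S = 15 * 60   # assumed wait before a stash can be retrieved

def scan_for_devices() -> set:
    # Stub for the phone's Bluetooth discovery call; returns the ids of devices in range.
    return {"device-a1", "device-b2"}

class BlowtoothGame:
    def __init__(self):
        self.planted = {}   # device id -> time the stash was "planted" on that mule

    def plant(self):
        for device_id in scan_for_devices():
            self.planted.setdefault(device_id, time.time())

    def retrieve(self):
        """Return the ids of mules rediscovered after the mandatory wait period."""
        now = time.time()
        in_range = scan_for_devices()
        return [d for d, planted_at in self.planted.items()
                if d in in_range and now - planted_at >= WAIT_PERIOD_S]
```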

Discussion

Mwahaha! I love this game! Seriously, I want to play it myself. I feel that this game has very subversive undertones, regardless of whether or not they were addressed, but I understand that subversion isn't the point of the game. Anything that integrates a virtual world with reality so well is an immediate success in my book, especially as far as HCI is concerned. I would think that the recent focus on cloud computing will be the next big widespread pervasive technology advance, and this game really brings to light some of the implications of pervasive technologies. How easy it is already to drag-and-drop a file on your desktop into your Dropbox and then pull it up on your mobile device which is attached to a projector in a presentation, and then send the file out to everyone in the room with basically the click of a button. What's next?

Games, like art, serve no practical purpose, other than to provide a tangible link to ideas, emotions, history, and the like. I mean, clearly this is not the first time that this idea has been explored:

Image courtesy of Alidade Incorporated

But it is the first time that the pervasive, real world element has been included in the mix. This does not mean that the importance of art can just be discounted; on the contrary, more care must be taken to create and preserve representations of the intangible than the tangible. Physical devices are tangible; our interactions with them can be tangible and intangible; how we are engaged and affected by those interactions is intangible.

Blog Entry #5: Dance.Draw

References

Latulipe, C., and Huskey, S. (2008). Dance.Draw: exquisite interaction. Proceedings of HCI 2008 (pp. 47-51). Liverpool: http://www.bcs.org/category/14372.

Article Summary

Dr. Latulipe's Dance.Draw project is part of a project called Exquisite Interactions, in which she explores the interaction of an artist with technology as they create their art. Dance.Draw specifically focused on creating visualizations based on the choreography of a dance for a single dancer controlling a single object, a single dancer controlling two objects, and three dancers controlling a single object. The merit of this system is its portability and accessibility: it uses three sets of two wireless computer mice and their USB receivers, a Mac computer, and a projector; it also costs around $1000, which is much cheaper than any other system of its type.

Dr. Latulipe exhibited this system three times in 2008, each time in a different environment with different restrictions on the display of the choreography and the display of the visualizations. It was well received by both the audience and the dancers at each venue. This system holds promising potential for future research in the area, according to the choreographer. Some areas that may be explored further include mouse-related choreography, i.e. learning how to deal with the fact that the dancers are holding mice, and alternative sensors to make the performance more organic.

Discussion

I found this application of technology to the field of art very interesting. We have electronic music as a performance art, and graphic design as an artistically-influenced field, but dance has never really presented itself as desirous of the option for technology interaction. To be fair, if there are laser shows at rock concerts, why can there not be visualizations based on a dancer's movements at a dance exhibition? It will be fun to see what doors will be opened by the inclusion of different types of sensors and the accuracy and precision with which the sensors detect the dancers' movements. We could project live-action fantasy movies with real-time renderings of monsters and such... Exciting :)

Paper Reading #5: A Multi-Touch Enabled Steering Wheel - Exploring the Design Space

Commentary

See what I have to say about Brian's and Pape's work.

References

Pfeiffer, M., et al. (2010). A multi-touch enabled steering wheel - exploring the design space. Proceedings of the ACM Conference on Human Factors in Computing Systems (pp. 3355-3360). Atlanta: http://www.sigchi.org/chi2010/.

Article Summary

Pfeiffer and his associates sought to extend the concept of steering wheel controls and generalize them with a multi-touch interface. They were motivated by the complex "infotainment" systems in most modern cars and their often distracting interfaces, and by the desire to extend previous work in this area. They created a working model of an input device that would take intuitive commands from the driver while allowing the driver to focus on the road and his driving.

The model steering wheel; input areas are indicated in white.

The driver was placed in a simulator that allowed him to experience the interface while actually keeping a car on the road and avoiding obstacles. He was asked to complete tasks such as "start playing music" or "open the navigation system" via thumb gestures, using either one or both thumbs. He did not have to search for the proper button on the steering wheel: he was simply asked to create a gesture that he felt would complete the task at hand. The researchers found that many of the same mental models already in place for other touch interfaces were utilized by the drivers to complete tasks, e.g. drawing a triangle to play music or using the pinch and pull techniques to zoom in and out of a map. They hope to extend the work to allow users to create customizable gesture sets and interfaces and be able to load them into other steering wheels in other cars.
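
In effect, the interface becomes a mapping from recognized thumb gestures to infotainment commands, with the gesture vocabulary left to whatever the driver finds natural. A trivial sketch of that dispatch, with placeholder gesture names and commands of my own:

```python
# Placeholder mapping from recognized thumb gestures to infotainment commands.
GESTURE_COMMANDS = {
    ("left_thumb", "triangle"): "start playing music",
    ("left_thumb", "swipe_right"): "next track",
    ("both_thumbs", "pinch"): "zoom out of the map",
    ("both_thumbs", "spread"): "zoom in on the map",
}

def handle_gesture(hands: str, shape: str) -> str:
    return GESTURE_COMMANDS.get((hands, shape), "no action")

print(handle_gesture("left_thumb", "triangle"))   # -> start playing music
```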

Discussion

Alright, this is pretty cool. I am such a believer in the superiority of intuitive, touch-based interfaces over the standard push-button style stuff that we do these days. I personally believe that it just makes more sense, to interact with an interface in a physical manner like we would expect to interact with real objects in the real world. I do believe that some standard push-button type interfaces do have their place in the world, but I would just like to see more gesturing and tactile interaction going on.

For example, I think that part of the iPhone's initial success in the market was due to its interface. You don't have to scroll around the screen with a ball or wheel or button - just touch what you want to happen and it happens. Sure, touch screens and tablets and portable devices had been around forever, but the iPhone was practical and intuitive, head and shoulders above the competition. It will be a very short while before we see these kinds of interfaces popping up all over the place... I just don't know where yet... Sure, we might not replace the "Play" and "Eject" buttons on our Blu-ray players, and we probably will leave the standard up-down buttons or slide lever on our thermostats, but there has to be something out there that would benefit from a touch screen!

Paper Reading #4: There's A Monster In My Kitchen: Using Aversive Feedback to Motivate Behaviour Change

Commentary

See what I have to say about Keith's and Adam's work.

References

Kirman, B., et al. (2010). There's a monster in my kitchen: using aversive feedback to motivate behaviour change. Proceedings of the ACM Conference on Human Factors in Computing Systems (pp. 2685-2694). Atlanta: http://www.sigchi.org/chi2010/.

Article Summary

Kirman and his colleagues seek to incorporate some of the discoveries of behavioral psychology into an HCI project that they call the Nag-baztag. The idea behind this system is not only to monitor power usage but to provide feedback to the user in what they hope is a more psychologically efficacious manner. The team outlines the concepts of positive and negative reinforcement, and how they will present each stimulus to the users of this system. In particular, the team aims to focus on aversive stimuli, such as punishment, for incorrectly performed behaviors or inconsiderate power usage. They address a concern about the applicability of a generalized psychological approach to different users by making their system adaptive: it compares actual power usage against how it expects the user to respond to the stimuli he has received, and adjusts its tactics to obtain the desired result.

The system will deliver its stimuli to the user in the form of verbal comments, mostly as punishment for improper or imprudent use of power to perform everyday tasks in a kitchen. All electronic devices and even the sink will be monitored for usage. The system will "nag" the user for their poor choices in power management, and can interact with the user via Facebook, Twitter, or SMS. The system will be given enough control over the environment so as to be able to restrict the use of devices which have a history of improper use, or even take actions that will deliver negative consequences to the user in a real-life situation, such as not allowing the stove or the faucet to be used.

Discussion

I think that this is an interesting concept that might only see a very limited market. If you are given control over whether or not the system is in the house, what is stopping you from just pulling the plug when you get tired of hearing it complain? Granted, you never would have bought the system in the first place if you were not actually planning on using it, but it seems like it would take a very determined and motivated person to put up with not being able to use their sink or stove because of some angry appliance rather than simply shutting it off.
An angry refrigerator.

That being said, it is curious that they chose to go with punishment over reward. Research has shown positive reinforcement to be much more efficacious than punishment in creating lasting behavioral changes. This is not to say that a little reminder every now and again would not be warranted, e.g. "You used too much water in the kettle this time." I understand their desire not to focus directly on negative reinforcement for the obvious reason, but maybe if a more holistic analysis of the space were taken into account, they would be able to streamline every aspect of the environment and remove the need for an adaptive system. For example:
  • Positive reinforcement: praise for using the optimal amount of water in the kettle for a cup of tea
  • Negative reinforcement: a buzzer sounds continuously while the kettle is left on too long until it is removed
  • Punishment: scolding for using too much water in the kettle
  • Omission: the stove will not heat up as fast as a result of consistently using too much water
This sort of thinking could be applied to the whole system without too much more overhead as far as implementation goes (as far as I can tell, in any case). It seems the user might respond better to something like this than to punishment alone.
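
As a closing illustration of that list, here is a toy feedback policy for the kettle scenario; the events, thresholds, and messages are invented for illustration, not taken from the paper's system.

```python
def feedback(event: str, water_used_l: float, optimal_l: float) -> str:
    """Toy policy mapping kitchen events to the four categories listed above."""
    if event == "kettle_filled":
        if water_used_l <= optimal_l:
            return "positive reinforcement: 'Nice, just enough water for your tea.'"
        if water_used_l <= 1.5 * optimal_l:
            return "punishment: 'You used too much water in the kettle this time.'"
        return "omission: heat the stove more slowly next time"
    if event == "kettle_left_on":
        return "negative reinforcement: sound the buzzer until the kettle is removed"
    return "no feedback"

print(feedback("kettle_filled", 0.25, 0.3))   # -> positive reinforcement
```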