26 April 2011

Paper Reading #20: iSlideshow: a Content-Aware Slideshow System

Commentary

See what I have to say about ___'s and ___'s work.

References

Chen, J., Xiao, J., and Gao, Y. (2010). iSlideshow: a content-aware slideshow system. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

This paper details a presentation system that groups photos and builds transitions based on content rather than by some arbitrarily defined effect. Content-based grouping allows the system to create one larger image from multiple smaller images that are seamlessly tiled together. The researchers implement a comparison algorithm that ensures a good flow of color and content from the edge of one image to another, building the entire scene iteratively:

Image courtesy of the above-cited article.
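Just to make the tiling idea concrete for myself, here's a tiny Python sketch of what an edge-comparison-driven layout might look like. The edge_cost measure and the greedy loop are my own guesses for illustration only, not the algorithm from the paper.

# Hypothetical sketch of content-aware tiling: repeatedly pick the candidate
# photo whose adjoining edge matches the current edge best in color.
# This is my own toy reconstruction, not the method from the paper.
from typing import List
import numpy as np

def edge_cost(left_img: np.ndarray, right_img: np.ndarray, strip: int = 8) -> float:
    """Mean squared color difference between the right strip of one image
    and the left strip of another (both H x W x 3 arrays of equal height)."""
    a = left_img[:, -strip:, :].astype(float)
    b = right_img[:, :strip, :].astype(float)
    return float(np.mean((a - b) ** 2))

def greedy_row(photos: List[np.ndarray]) -> List[int]:
    """Build one row of the collage iteratively: start with photo 0 and keep
    appending whichever remaining photo flows best from the current edge."""
    order, remaining = [0], set(range(1, len(photos)))
    while remaining:
        current = photos[order[-1]]
        best = min(remaining, key=lambda i: edge_cost(current, photos[i]))
        order.append(best)
        remaining.remove(best)
    return order

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = [rng.integers(0, 256, size=(64, 48, 3)) for _ in range(4)]
    print(greedy_row(demo))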

The researchers also use facial recognition to generate transitions that are relevant to the scene. Once photos are grouped by similar content with respect to the detected faces, the system builds transitions that take into account the positions of faces across the various photos in the group.

The results of a user study conducted by the researchers show that participants enjoyed using this system more than the alternatives, rating it higher on aesthetic appeal and fun. Several users mentioned that the slideshows seemed more meaningful with the content-aware transitions.

Discussion

I feel like this is a pretty innovative use of a content-aware system for manipulating photos and such. The fact that it was so well received by the users included in the study leads me to believe that this type of system would be very successful in a mainstream market. I personally would like to try it out! I feel like the current success of arbitrary transition effects and vanilla slideshow presentations is due in large part to the fact that they are just so easy to implement, and any of the more impressive effects, like the ones detailed in this article, are viewed as being too difficult for anything but a Photoshop power user or something similar. I'm glad to see the steps that these researchers took with respect to interface design and making the creation of aesthetically pleasing presentations fun and exciting.

Paper Reading #15: TurKit: Human Computation Algorithms on Mechanical Turk

Commentary

See what I have to say about ___'s and ___'s work.

References

Little, G., et al. (2010). TurKit: human computation algorithms on Mechanical Turk. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

In this paper, the researchers describe TurKit, a scripting environment for Amazon's Mechanical Turk (MTurk). MTurk allows people to post Human Intelligence Tasks (HITs) as jobs, for which MTurk Workers (Turkers) may be paid a few cents or more, depending on the difficulty of the task and the amount of time it takes to complete. Some tasks include adding tags or descriptions to images, organizing images based on their content, or voting on the most descriptive or grammatically correct passage of writing in a set. MTurk provides an API that allows users to interface directly with the job creation system. The TurKit environment helps people who post jobs to automate the posting process and to run iterative jobs as well.

Image courtesy of the above-cited article.

TurKit utilizes the crash-and-rerun programming paradigm, in which a script runs until it crashes or is terminated and then restarts from the beginning, with the results of completed steps recorded so they are not repeated. Since posting jobs costs money, the key advantage of TurKit is that posters can be assured that tasks which have already been completed are not reposted, thereby saving money. TurKit also provides for the automation of parallel tasks, a key advantage of MTurk.
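The crash-and-rerun idea clicked for me once I thought of it as memoization that survives crashes, so here's a toy sketch of that trick in Python. Keep in mind that TurKit itself is JavaScript, and post_hit and the journal file here are stand-ins I made up, not TurKit's real API.

# Toy crash-and-rerun sketch: the script can crash and be restarted from the
# top, but results of costly steps (like posting a paid HIT) are journaled on
# first execution and replayed on every rerun, so money is never spent twice.
# This mimics the idea only; it is not TurKit's actual implementation.
import json, os

JOURNAL = "journal.json"
_journal = json.load(open(JOURNAL)) if os.path.exists(JOURNAL) else []
_cursor = 0

def once(expensive_fn):
    """Run expensive_fn the first time this step is reached; replay afterwards."""
    global _cursor
    if _cursor < len(_journal):           # already done on a previous run
        result = _journal[_cursor]
    else:                                 # first time: actually do the work
        result = expensive_fn()
        _journal.append(result)
        json.dump(_journal, open(JOURNAL, "w"))
    _cursor += 1
    return result

def post_hit(prompt):                     # hypothetical stand-in for an MTurk call
    print(f"(spending money) posting HIT: {prompt}")
    return f"worker answer to '{prompt}'"

# The "script": rerunning it skips any HIT that was already posted.
first = once(lambda: post_hit("Describe this image"))
second = once(lambda: post_hit(f"Improve this description: {first}"))
print(second)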

Discussion

I'm such a huge fan of automation. Parallelism is an added plus, which I'm hoping is something that will start to catch on here in the not-too-distant future. I also like when system maintainers publish an API so that users with that DIY ethic can tinker around and build something that is incredibly useful, if not for the general population then at least for themselves. In short, I like everything I've heard surrounding MTurk and TurKit.

I think that the thing I'm most excited about with respect to this article is the practicality of TurKit. It takes advantage of all of the best parts about MTurk and makes them more accessible. It's a simple design that is very well executed. Nice work :)

Paper Reading #14: A Framework for Robust and Flexible Handling of Inputs with Uncertainty

Commentary

See what I have to say about ___'s and ___'s work.

References

Schwarz, J., et al. (2010). A framework for robust and flexible handling of inputs with uncertainty. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

This paper details a system that handles uncertain or ambiguous input. The researchers have devised a system that extends the conventional input model. Whereas in a conventional system a user action either causes or does not cause a system action, in the uncertain system all possible actions are taken into account and the most probable action is chosen. Actions do not cause final, irreversible changes to the system until temporary feedback is given to ascertain the intended input or the inferred action crosses a certain probability threshold. In the event that no action crosses the "most probable" threshold, the user is given temporary feedback while performing the action and may alter it to generate the desired response. One possible type of temporary feedback is detailed in the images below. As the user tries to select one slider, both are accidentally activated. A conventional system would just select one slider regardless of user intention, or do nothing at all. The uncertain system selects both sliders, gives temporary feedback on the possible state, and then allows the user to correct their input before a finalized action is taken.

Image courtesy of the above-cited article.

This system can handle uncertainty with both graphical and text input, including activation of multiple tiny buttons; inexactly placed input for scrolling, resizing windows, and dragging and dropping icons; multiple interpretations for spoken input to text translation; and greater ease of use for people with motor impairments.
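To wrap my head around the probabilistic dispatch, I sketched a stripped-down version in Python. The Gaussian scoring, the threshold value, and the slider setup are all my own simplifications for illustration, not the authors' framework.

# Minimal sketch of probabilistic input dispatch: score each widget against a
# fuzzy touch, commit only when one interpretation is confident enough, and
# otherwise give temporary feedback so the user can refine the input.
from dataclasses import dataclass
import math

@dataclass
class Widget:
    name: str
    x: float
    y: float

def likelihood(widget: Widget, touch_x: float, touch_y: float, sigma: float = 20.0) -> float:
    """Gaussian likelihood that a touch at (touch_x, touch_y) was aimed at widget."""
    d2 = (widget.x - touch_x) ** 2 + (widget.y - touch_y) ** 2
    return math.exp(-d2 / (2 * sigma ** 2))

def dispatch(widgets, touch, threshold=0.7):
    scores = [likelihood(w, *touch) for w in widgets]
    total = sum(scores) or 1.0
    ranked = sorted(((s / total, w) for s, w in zip(scores, widgets)),
                    key=lambda p: p[0], reverse=True)
    p, best = ranked[0]
    if p >= threshold:
        return f"commit: activate {best.name} (p={p:.2f})"
    candidates = ", ".join(f"{w.name} ({q:.2f})" for q, w in ranked if q > 0.1)
    return f"tentative feedback only; plausible targets: {candidates}"

sliders = [Widget("volume", 100, 50), Widget("brightness", 130, 50)]
print(dispatch(sliders, (115, 52)))   # ambiguous: both sliders highlighted
print(dispatch(sliders, (98, 49)))    # confident: volume gets activated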

Discussion

As with most projects that are at least latently philanthropic in nature, I really enjoyed this paper. For starters, gently correcting for erroneous input seems like a great idea, and it seems that this system does this without generating a large amount of overhead. We are already used to word processors automatically correcting our commonly misspelled words, or Google showing us results for what they think we really meant to search. Second, their results show a high rate of success in increasing ease-of-use for motor-impaired individuals, which is awesome. Admittedly, automatic "corrections" or "suggestions" can sometimes be pretty annoying; from what I've read, this system seems to strike a good balance.

Paper Reading #13: Gestalt: Integrated Support for Implementation and Analysis in Machine Learning

Commentary

See what I have to say about ___'s and ___'s work.

References

Patel, K., et al. (2010). Gestalt: integrated support for implementation and analysis in machine learning. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

This group of researchers presents Gestalt, a development environment that provides integrated support for implementing and analyzing machine learning systems. They demonstrate it with a gesture recognition application, using Gestalt to track down bugs in the source code and cases where the system recognizes or fails to recognize a gesture.

Discussion

In all honesty, this paper was completely incomprehensible to me.

Image courtesy of Queen Michelle

Paper Reading #12: Pen + Touch = New Tools

Commentary

See what I have to say about ___'s and ___'s work.

References

Hinckley, K., et al. (2010). Pen + touch = new tools. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

In this paper, the researchers examine the capabilities of a multimodal interface with touch- and pen-based interaction. They believe that interactions can be separated into unimodal and multimodal categories based both on input device and intended action.

In the first phase of their project, the researchers conducted an initial study where subjects used pen, paper, and various tools to organize objects in a notebook. They took note of common actions that all subjects performed, and of the affordances a fully manipulable environment granted them. Some examples that helped the researchers categorize different unimodal and multimodal commands include the subject tucking the pen between the fingers to manipulate objects in the environment, using only the fingers to hold down or reposition objects, or using objects as part of the environment with one hand while drawing with the other.

Using their observations from the first phase, they designed an interface on the Microsoft Surface that incorporates as many of the natural affordances observed in the first phase as possible: a multimodal pen-and-touch interface. Touch and pen input can each be used unimodally and have their own affordances in those cases; when the input is combined, i.e. multimodal, the context of the interactions changes and a new set of affordances becomes available. They categorize this division of labor as follows: "...the pen writes, and touch manipulates, period." Some of the combined, multimodal interactions included: holding objects together and tapping with the pen to "staple" them; holding an object steady and using the pen as an X-acto knife; holding an object steady and creating a "carbon copy" by dragging a copy off of it with the pen; and holding an object steady and using it as a straightedge along which to draw with the pen.

Using an object in the scene as a straightedge.
Image courtesy of the above-cited paper.
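The "pen writes, touch manipulates" split basically amounts to dispatching on which modalities are currently active, which is easy to caricature in code. This little Python sketch is my own framing with made-up tool names, not the researchers' implementation.

# Rough sketch of how a pen + touch system might pick a tool based on whether
# the pen is down and what the non-dominant hand is holding. The tool names
# and the rules themselves are illustrative only.
def choose_tool(pen_down: bool, held_objects: list[str]) -> str:
    if pen_down and not held_objects:
        return "ink"                      # unimodal pen: the pen writes
    if held_objects and not pen_down:
        return "move / rotate"            # unimodal touch: touch manipulates
    if pen_down and len(held_objects) >= 2:
        return "staple the held stack"    # multimodal: pen tap on a held pile
    if pen_down and len(held_objects) == 1:
        return "cut / carbon copy / straightedge on the held item"
    return "idle"

print(choose_tool(True, []))                     # ink
print(choose_tool(False, ["photo1"]))            # move / rotate
print(choose_tool(True, ["photo1", "photo2"]))   # staple the held stack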

Discussion

This interface is even more intuitive than the last one I reviewed! And I love it! I appreciate the work that the researchers put into observing natural interactions with the type of environment that they wanted to create. This seems to be the smartest way to make an interface parallel interaction in the real world, and indeed, to allow an interface to achieve its maximum usability potential. The way they separated out the roles of touch and pen was ingenious. All in all, this is one of the best designs for a new interface I have seen throughout these papers. I'm almost as excited about this interface as I am about the Minority Report-style interface :)

Paper Reading #11: Hands-On Math: A page-based multi-touch and pen desktop for technical work and problem solving

Commentary

See what I have to say about ___'s and ___'s work.

References

Zeleznik, R., et al. (2010). Hands-on math: a page-based multi-touch and pen desktop for technical work and problem solving. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

The researchers for this paper present a fusing of two technologies that seem to complement each other well: pencil-and-paper mathematical calculations and a Computer Algebra System (CAS). They argue that pencil and paper provide a greater degree of freedom with respect to spatial interaction and are more intuitive because of the physical method of input. They do not discount the usefulness of a CAS in solving complex equations much more quickly and efficiently than could be done by hand, however. They have created a system built on the Microsoft Surface that combines these two interaction styles, along with multitouch capabilities and multimodal input, i.e. light pen and fingers.

The researchers detail the capabilities of their system in depth. The system affords the user the ability to create and delete pages, pan across the virtual tabletop, access context-specific menus, and influence the context of gestures depending on the combination of pen and touch inputs utilized. The researchers also implemented preexisting Software Development Kits (SDKs) to complete the mathematical operations. Overall, the results from a user study they conducted support their hypothesis, but the participants offered many suggestions for future expansion as well. Specifically, users detailed what functionality they would have liked to see added to pages, e.g. growing or shrinking pages, or having data on one page accessible to all pages, and noted issues regarding the necessity of having both hands free for gesture input.

Discussion

I really like this design concept. I feel the exact same way as the researchers do about the affordances of pencil-and-paper and CAS. Personally, I can't wait until this happens:

Image courtesy of Niobium Labs

There is something that is just so inherently intuitive about touch interfaces: every interaction between ourselves and our physical world is through some form of manual manipulation. I'm glad that these types of interfaces are becoming more mainstream. That is all.

Paper Reading #10: PhoneTouch: A Technique for Direct Phone Interaction on Surfaces

Commentary

See what I have to say about ___'s and ___'s work.

References

Schmidt, D., et al. (2010). PhoneTouch: a technique for direct phone interaction on surfaces. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

This group of researchers presented a novel interaction scheme utilizing a mobile phone as a stylus in conjunction with a finger, with interaction-dependent responses based on whether a phone or a finger was used. The interface was a tabletop touchscreen that connected to one or more phones via Bluetooth and discriminated between phone and finger touches via impact size and accelerometer data sent by a device attached to the phone. The image below depicts this setup:

Image courtesy of the above-cited article.

The idea behind this design is that interaction with the interfaces changes context when a phone is used over a finger. For example, a user can open up their photo album, touch the interface with their phone, and have the photos displayed on the interface. They can then move images around, examine them, and separate them into groups, and then transfer those images to another phone by tapping on the group with the phone. In general, the results coincide with what the researchers were hoping to achieve.
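The phone-versus-finger discrimination boils down to matching a touch on the surface against a recent accelerometer spike reported by some phone, so here's a rough sketch of that matching step in Python. The time window, the contact-area cutoff, and the event format are assumptions on my part, not values from the paper.

# Sketch of matching surface touches to phone "bump" events by timestamp and
# contact size. Thresholds and data structures are invented for illustration.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class SurfaceTouch:
    t: float               # seconds
    contact_area: float    # square centimeters

@dataclass
class PhoneBump:
    t: float
    phone_id: str

def classify(touch: SurfaceTouch, bumps: List[PhoneBump],
             window: float = 0.05, max_phone_area: float = 1.0) -> str:
    """Call a touch a phone touch if some phone reported an accelerometer spike
    within `window` seconds and the contact area is small enough."""
    nearby: Optional[PhoneBump] = next(
        (b for b in bumps if abs(b.t - touch.t) <= window), None)
    if nearby and touch.contact_area <= max_phone_area:
        return f"phone touch ({nearby.phone_id})"
    return "finger touch"

bumps = [PhoneBump(t=10.012, phone_id="alice-phone")]
print(classify(SurfaceTouch(t=10.020, contact_area=0.4), bumps))  # phone touch
print(classify(SurfaceTouch(t=12.300, contact_area=1.8), bumps))  # finger touch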

Discussion

This is an interesting concept, and I'd like to see it implemented in the real world, but I believe that will only happen once a device like this interface becomes more multipurpose, not to mention mainstream. There are already table-type touch interfaces, but they are not very widespread. The benefit of personal computers is that they are ubiquitous and multipurpose, i.e. one can do more than just interact with photos and files on a phone. The cost of this type of system would make it prohibitively expensive for anyone but perhaps a gadget junkie until the above issues are addressed. Really neat concept though :)

Paper Reading #8: Communicating Software Agreement Content Using Narrative Pictograms

Commentary

See what I have to say about ___'s and ___'s work.

References

Kay, M., and Terry, M. (2010). Communicating software agreement content using narrative pictograms. Proceedings of the ACM Conference on Human Factors in Computing Systems (pp. 2705-2714). Atlanta: http://www.sigchi.org/chi2010/.

Article Summary

These researchers set out to address several common issues with software agreements, such as End User License Agreements, and their reception by a user base. A few key issues that they cite are the amount of text and the reading level it demands, the fact that some agreements are localized for only a few locations, and the proven success of pictorial communication of ideas contrasted with its absence from software agreements.

The researchers set up a system of pictograms depicting how content and usage information would be collected from an image editing program. They utilized four sets of diagrams, each depicting a different type of data that would be collected and how it would be used. Below is one such diagram with the explanatory text removed, as a subject of the study might have seen it and had to interpret it:

Image courtesy of the above-cited article.

In general, the results were promising: their initial test group had some difficulty identifying the concepts outlined by the pictograms, but with some slight modifications, the rest of the subjects did sufficiently well. Having basic text explaining the images did much to increase understanding of the images, and the text was much easier to read than the software agreement itself.

Discussion

I enjoyed the concept of pictorial descriptions of license agreements, but I doubt that the licenses will ever go away entirely. They are obviously necessary for legal reasons, and in some instances a user will trudge through an agreement out of boredom or curiosity or out of a genuine desire to know his limits and freedoms. A main issue that the researchers will have to overcome is comprehension of the pictures. The licenses themselves, while sometimes inherently incomprehensible due to the high level of "legalese," still provide an accurate and complete description of the responsibilities of both the user and the company providing the software. The pictures, while only meant to "augment the text," as the researchers put it, will still have to provide an accurate description of the text. I can only imagine the problems that would arise, what with this country's affinity for frivolous lawsuits and the like:

Image courtesy of Natalie Dee.

Paper Reading #7: Hard-To-Use Interfaces Considered Beneficial (Some of the Time)

Commentary

See what I have to say about ___'s and ___'s work.

References

Riche, Y., et al. (2010). Hard-to-use interfaces considered beneficial (some of the time). Proceedings of the ACM Conference on Human Factors in Computing Systems (pp. 2705-2714). Atlanta: http://www.sigchi.org/chi2010/.

Article Summary

This group of researchers took an entirely novel approach to evaluating their systems for usability: they took something that would initially have been considered a bug or a barrier to usability from their perspective and instead treated it as a feature.

Image courtesy of The Geek Whisperer.

In their first study, the researchers placed a group of computer scientists in a collaborative design environment; the group discovered a bug that linked each member's interactions with the interface to the others' actions. The solution that the group implemented themselves was to increase social interaction outside of the system, e.g. waiting for someone else to finish a task before beginning their own, or asking the group to pause so that they could complete their task. The result of this collaboration was that the group that had to deal with the "bug" reported higher satisfaction with the system, because the added interaction allowed them to make fewer errors related to each other's actions.

In the second study, the researchers questioned a group of older individuals to gauge their satisfaction with new interfaces that are designed to make interacting with technology easier. The results of this examination, however, showed that the group did not value correspondence generated with the help of technology as much as they valued correspondence generated by hand. For this group, apparent effort equated to higher value. The researchers' proposed solution to this problem was to make an interface explicitly harder to use in order to increase the perceived value of messages generated with it.

Discussion

I was really excited to see the evaluation of this specific issue, a "bug" becoming a "feature" instead of being immediately fixed. It obviously worked out well in these cases, and speaks volumes about our ability as a species to overcome adversity. I know that sounds really cheesy, but it's true. Our society is all about instant gratification: I want this and I want it now! Things like this force us to take a step back and actually embrace our humanity rather than push it to the wayside. For some definition of the word "humanity," that is... Now this isn't to say that I don't appreciate the fast response of Google or the wide knowledge base that something like Netflix or Pandora utilizes to generate content and so on. I like those services for what they are: tools. Knowing that they exist, would I be disappointed if I couldn't use them anymore? Yeah, I would. Would I mourn their loss like the loss of a friend? Absolutely not. I don't know if there's anyone out there who would freely admit that they would, but take a look at how "constantly connected" we are to our technology. If you've ever done it, you know how nice it is to unplug every once in a while.

Image courtesy of Trip Advisor.

25 April 2011

Paper Reading #24: A Natural Language Interface of Thorough Coverage by Concordance with Knowledge Bases

Commentary

See what I have to say about ___'s and ___'s work.

References

Han, Y., et al. (2010). A natural language interface of thorough coverage by concordance with knowledge bases. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

In this paper, Han et al. discuss a novel approach to solving a common problem with natural language interface (NLI) systems. As illustrated below, of the total set of expressions a user might possibly input, there is a partial disparity between those expressions that a given system can interpret and those which a given knowledge base can answer.

Image courtesy of the above-cited article.

Whereas most NLI systems try to make up the difference by expanding the number of expressions interpretable by the system, this team proposed to generate the interpretable expressions from the expressions answerable by the knowledge base instead. They identify several levels of classification based on a graph representation of the knowledge base, and generate all queries for a given level that are answerable. They then match a user's expression to one of the answerable expressions via a similarity measure.
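To convince myself I understood the "generate answerable expressions, then match" idea, I put together this toy Python sketch. The tiny knowledge base, the question template, and the token-overlap similarity are all stand-ins I invented; the paper's graph-based generation and similarity measure are far more sophisticated.

# Toy sketch: enumerate the questions a tiny knowledge base can answer, then
# map a user's sentence to the closest one with a crude token-overlap score.
def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

# Knowledge base: (subject, relation, object) triples.
kb = [("Paris", "capital_of", "France"), ("Berlin", "capital_of", "Germany")]

# Generate every question the knowledge base can actually answer.
answerable = {f"what is the capital of {obj}": subj
              for subj, rel, obj in kb if rel == "capital_of"}

def answer(user_expression: str) -> str:
    best = max(answerable, key=lambda q: jaccard(q, user_expression))
    return f"matched '{best}' -> {answerable[best]}"

print(answer("Could you tell me the capital city of France"))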

Discussion

I don't know how any NLI systems work, but honestly, this approach, while (purportedly) novel, does not seem like a great feat of science and research. Maybe it is, I don't know: like I said, I know nothing about NLI systems. But I feel like I might possibly have stumbled upon this concept if you gave me the image above and described the problem to me. I'm just saying. That being said, was 2010 really the first time that anyone had thought to cast the problem like this, or was that just the first time anyone had thought to publish a paper on the topic? It just really seems like this specific problem is one that computer science as an all-encompassing entity would have solved a long time ago. Please, rebuke me, correct me, enlighten me if you can.

19 April 2011

Paper Reading #25: Using Language Complexity to Measure Cognitive Load for Adaptive Interaction Design

Commentary

See what I have to say about ___'s and ___'s work.

References

Khawaja, M. A., Chen, F., and Marcus, N. (2010). Using language complexity to measure cognitive load for adaptive interaction design. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

The researchers addressed the problem of adaptive interfaces in this paper, specifically adaptive interaction design. They proposed a method of controlling the level of adaptation of an interface by monitoring speech patterns. Different patterns were mapped to different levels of cognitive load, or how much of the brain's processing power is being used to compute a task. Current speech recognition capabilities make real-time implementation infeasible; instead, the researchers performed quantitative analysis on transcriptions of several training exercises at an Australian bushfire management facility. They measured semantic difficulty, reflected in word usage, and syntactic complexity, reflected in sentence length. Their hypothesis proved correct save for one prediction. Using statistical analysis, they were able to apply different language complexity measures to accurately map sentences used in different situations to the appropriate level of cognitive load for that situation.

For lack of a more salient picture, here is a table of their results.

As speech recognition capabilities become more advanced, they hope to test their implementation in a real-time situation.
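For the curious, here's roughly the kind of surface measure they're talking about, sketched in Python. These particular formulas (average sentence length, type-token ratio, average word length) are my own crude stand-ins rather than the exact measures from the paper, and the sample "transcripts" are made up.

# Crude language-complexity profile: longer sentences and a richer, longer
# vocabulary are treated as hints of higher cognitive load. Illustrative only.
import re

def complexity_profile(transcript: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", transcript) if s.strip()]
    words = re.findall(r"[A-Za-z']+", transcript.lower())
    return {
        "mean_sentence_length": len(words) / max(len(sentences), 1),
        "type_token_ratio": len(set(words)) / max(len(words), 1),
        "mean_word_length": sum(map(len, words)) / max(len(words), 1),
    }

calm = "Crew two, hold your line. Wind is steady. Report when ready."
loaded = ("Reposition the secondary containment crews immediately because the "
          "southeastern flank is escalating faster than the predicted spread "
          "model indicated.")
print(complexity_profile(calm))
print(complexity_profile(loaded))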

Discussion

I really enjoyed this paper. Sometimes, I think I should have gone into English, specifically linguistics. Don't get me wrong: I like to chill out a little bit in this here discussion section, but I feel that crafting a literary piece or a technical document is akin to creating a work of art. Languages are so easy to use in a practical sense, but they can be wonderfully complex if they are truly understood. These researchers understand language (or at least how to analyze it). What's more is that they understand the importance of language and communication. Once speech recognition is up to par with the vision that these researchers have laid out, our interactions with our machines will be as seamless as our interactions with each other. Technology is a tool, and nothing more, but capable, efficient tools only make sense; why would we not strive to reach the limits of our potential to create and design?

07 April 2011

Paper Reading #9:Imaginary Interfaces: Spatial Interaction with Empty Hands and without Visual Feedback

Commentary

See what I have to say about ___'s and ___'s work.

References

Gustafson, S., Bierwirth, D., and Baudisch, P. (2010). Imaginary interfaces: spatial interaction with empty hands and without visual feedback. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

In this article, a system called Imaginary Interfaces is described. The article focuses on screen-less wearable devices, presents the setup and results of a user study conducted by the researchers, and details a mock-up of such a device. The idea behind this interface is to have the user create a "screen context," in which they operate as if they were using a device providing spatial feedback:

Image courtesy of the above referenced article.

All of the spatial information is contained in the user's mind, however; all interaction is done relative to the user's frame of reference. Three user studies were conducted: the user drew common shapes and letters multiple times, and the variation between consecutive drawings was analyzed; the user drew an image and then was required to identify a specific point on that image, both with and without changing their frame of reference, i.e. standing still or turning, respectively; and the user identified a point in a coordinate system whose units were lengths of digits, i.e. (2,1) referred to two thumbs right and one finger up. The results were very interesting, casting this type of device as something that could be practically realized. Finally, the researchers proposed a possible design for the device, which works by illuminating the user's hands with infrared light, applying a luminance threshold, and discerning the structures that comprise the imaginary interface:

Image courtesy of the above referenced article.
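The thresholding step in that proposed design is simple enough to sketch. This is just the illuminate-and-threshold idea applied to a made-up frame with a made-up threshold, not the prototype's actual pipeline.

# Minimal luminance-threshold segmentation: keep only pixels bright enough to
# be infrared-lit skin. The frame and threshold value are fabricated for demo.
import numpy as np

def hand_mask(ir_frame: np.ndarray, threshold: int = 180) -> np.ndarray:
    """Return a boolean mask of pixels at or above the luminance threshold."""
    return ir_frame >= threshold

# Fake 6x8 infrared frame: a bright "hand" blob on a dark background.
frame = np.full((6, 8), 40, dtype=np.uint8)
frame[2:5, 3:6] = 220
mask = hand_mask(frame)
print(mask.astype(int))
print("hand pixels:", int(mask.sum()))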

Discussion

First of all, to say nothing of the subject of the paper itself, this was the most straightforward research I have yet encountered. The concept of the device was clearly laid out, motivation and previous work were well presented, and the results of the user studies were very accessible. That being said, I want one. I am fascinated by the concept of wearable computing, and this just takes it one step further. I honestly believe that so many awkward interactions could be alleviated through the use of something like this, e.g. you're on the phone and you just cannot express some simple concept or idea, and if only you were able to just draw a simple diagram, or had a little whiteboard... I really like this concept. I think they did a great job of coming up with tangible results from creating a frame of reference to come back to without having to be in the same visual context, i.e. you can pull up your frame of reference anywhere you want by just popping up your thumb and forefinger. Great work :)

05 April 2011

Paper Reading #19: From Documents to Tasks: Deriving User Tasks from Document Usage Patterns

Commentary

See what I have to say about Shena's and Derek's work.

References

Brdiczka, O. (2010). From documents to tasks: deriving user tasks from document usage patterns. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

In this paper, the researcher presents a novel approach to implementing a task management support system. A system such as this takes care of monitoring the tasks of knowledge workers and clustering common tasks to increase productivity. The researcher argues that switching between multiple tasks is "expensive because each task requires some recovery time as well as the reconstitution of task context." This system is novel because it does not group documents based on title or content, both of which introduce privacy concerns. Instead, documents are assigned a unique identifier and are filtered by their dwell time, or how long they have focus and are actively being accessed. Documents are then grouped by similarity via a spectral clustering algorithm. The proposed system has the additional benefit of not needing any user input whatsoever, which most other systems of its type require. The system was evaluated over a period of a month, with observation days being non-contiguous. Normal knowledge workers were observed performing some commonly recurring tasks. The proposed system showed a high level of effectiveness at task grouping in comparison with similar systems.
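Here's a small Python sketch of the pipeline as I picture it: anonymous document IDs, a similarity score based on documents being in focus close together in time, then spectral clustering into task groups. The similarity definition, the time window, and the toy focus log are my own inventions, not what the paper actually uses.

# Toy task clustering: documents that are repeatedly in focus near each other
# in time get a high affinity and end up in the same cluster.
import numpy as np
from sklearn.cluster import SpectralClustering

# Focus log: (document_id, start_second, dwell_seconds), already filtered to
# accesses with enough dwell time.
log = [("d0", 0, 120), ("d1", 130, 90), ("d0", 240, 60),          # morning task
       ("d2", 3600, 200), ("d3", 3820, 150), ("d2", 4000, 90)]    # afternoon task

docs = sorted({d for d, _, _ in log})
idx = {d: i for i, d in enumerate(docs)}
sim = np.zeros((len(docs), len(docs)))
for d_a, t_a, _ in log:                      # co-access within a 10-minute window
    for d_b, t_b, _ in log:
        if d_a != d_b and abs(t_a - t_b) < 600:
            sim[idx[d_a], idx[d_b]] += 1.0
sim += 0.01                                  # keep the affinity graph connected
np.fill_diagonal(sim, sim.max())

labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(sim)
print(dict(zip(docs, labels)))               # e.g. {'d0': 0, 'd1': 0, 'd2': 1, 'd3': 1}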

Discussion

This paper seemed incredibly abstract to me. I'm sure that at least some of the industry-specific terms like "knowledge worker" and "recovery time" and "reconstitution of task context" must have meaning to someone out there, right? I just don't understand what the point of "task clustering" is supposed to be; what does it do? How does it increase productivity? To a lowly, uninitiated "luser" like me (NO I WILL NOT APOLOGIZE FOR THAT DON NORMAN), it feels like we're "promoting synergy" or some other dumb catch phrase:

Image courtesy of The Lonely Island and Oh! Ryan Kelley

I'm not knocking the paper, the author, or his work; he seems to have done a pretty stellar job. And I appreciate the novelty of his approach, not to mention the apparent success of his method. I just don't get it. At all.

02 April 2011

Paper Reading #16: Mixture Model based Label Association Techniques for Web Accessibility

Commentary

See what I have to say about Wesley's and Miguel's work.

References

Islam, M. A., Borodin, Y., and Ramakrishnan, I. V. (2010). Mixture model based label association techniques for web accessibility. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

Islam et al. present a system they have created that extends the functionality of an assistive technology called screen reading. This technology utilizes text-to-speech to allow blind users to navigate websites by reading the content on webpages and descriptions of elements to them. A major impediment to the proper functioning of screen readers is the omission of labels for page elements and alternative text for images. Without proper labels, form elements can be misrepresented or not denoted at all. Without alternative text for images, transaction functionality for most websites is completely lost, as transactional dialogs are usually controlled through images, e.g. an "Add to cart" or "Checkout now" button:

Taken from the above referenced paper.

Even properly labeled items are sometimes not handled properly by the screen reader by virtue of the ambiguity of the HTML Document Object Model (DOM), e.g. labels for elements and the elements themselves being contained in different HTML table rows:

Taken from the above referenced paper.

The authors implemented a finite mixture model (FMM) to create contexts to which HTML elements and possible labels belong. Using these contexts, the FMM can also create labels for unlabeled elements with some accuracy and more correctly interpret labels for ambiguous objects. In evaluating their system, the authors observed a 76% success rate of correctly applying labels to their elements without prior training by their FMM and a 95% success rate with prior training when all elements were explicitly labeled. On a testing set without any labels, their FMM achieved an 81% success rate. In evaluating their system through a user study with two blind users who were proficient with screen reading technology, both blind users agreed that the FMM made interacting with webpages easier for them.
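The finite mixture model itself is well beyond a blog snippet, so here's a much cruder stand-in that at least captures the flavor of associating labels with form elements by their geometric context. The scoring function, weights, and coordinates are all invented for the example; this is not the authors' FMM.

# Crude label association: score each candidate text node as the label for a
# form field by combining distance and row/column alignment, then pick the best.
from dataclasses import dataclass
from typing import List
import math

@dataclass
class Node:
    text: str
    x: float
    y: float

def label_score(field: Node, candidate: Node) -> float:
    dist = math.hypot(field.x - candidate.x, field.y - candidate.y)
    aligned = abs(field.y - candidate.y) < 5 or abs(field.x - candidate.x) < 5
    return 1.0 / (1.0 + dist) + (0.5 if aligned else 0.0)   # closer and aligned is better

def best_label(field: Node, candidates: List[Node]) -> str:
    return max(candidates, key=lambda c: label_score(field, c)).text

email_field = Node("<input>", x=200, y=100)
candidates = [Node("Email address", x=80, y=101),
              Node("Ship to a different address?", x=80, y=320)]
print(best_label(email_field, candidates))   # prints: Email address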

Discussion

Wow. I personally feel that this is some of the most incredible research I've discovered yet. The authors have created a system that solves a very practical problem, and solves it well. The idea of creating contexts from which to infer labels was ingenious, as was the systematic approach to evaluating documents geometrically. I have to commend them: the system they created seems to be entirely robust. In addition, while this work may have implications outside of catering to users who need assistive technology, I feel that there was at least some measure of philanthropic drive behind the project, purposefully or not. I approve of this work.

Paper Reading #17: Mobia Modeler: Easing the Creation Process of Mobile Applications for Non-Technical Users

Commentary

See what I have to say about Joshua's and Shena's work.

References

Balagtas-Fernandez, F., Hussmann, H., and Tafelmayer, M. (2010). Mobia Modeler: easing the creation process of mobile applications for non-technical users. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

In this article, the researchers present a system composed of an abstraction model and its respective interpreter, called the Mobia Modeler. The aim of the system is to allow users without technical experience in mobile platform programming to create practical applications with the modeling system. A usage scenario presented by the researchers involved a doctor creating an application that would monitor a patient's vital signs and alert the doctor or emergency officials in the event of abnormal or dangerous observations. The system comprises two subsets of the modeling language: a Platform Independent Model (PIM) and a Platform Specific Model (PSM). Users create applications by linking components together via the graphical user interface, and the system's processor then translates the graphical model to platform-specific code based on the relationships between the components that were defined by the user. Put another way, the user writes an application in the PIM, and the processor translates it to the PSM. Participants in the evaluation included individuals from different fields and from different technical backgrounds, i.e. those with programming experience and those without. The evaluation revealed that the programmers rated the system higher than the non-programmers.
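To illustrate the PIM-to-PSM idea (and nothing more), here's a toy translation step I made up. The component format and the generated "API" names are pure fiction, not the Mobia Modeler's internals or any real mobile SDK.

# Toy platform-independent model: an abstract component list that a processor
# turns into different platform-specific pseudo-code per target.
PIM_APP = [
    {"component": "sensor", "type": "heart_rate", "id": "hr"},
    {"component": "alert", "when": "hr > 120", "notify": "the doctor"},
]

TARGET_API = {"android-like": "FakeDroidSensors", "iphone-like": "FakePhoneSensors"}

def to_psm(model, platform: str) -> str:
    """Emit pseudo-code for one target platform from the abstract model."""
    lines = [f"// generated for {platform}"]
    for comp in model:
        if comp["component"] == "sensor":
            lines.append(f'{TARGET_API[platform]}.subscribe("{comp["type"]}", "{comp["id"]}");')
        elif comp["component"] == "alert":
            lines.append(f'if ({comp["when"]}) notify("{comp["notify"]}");')
    return "\n".join(lines)

print(to_psm(PIM_APP, "android-like"))
print()
print(to_psm(PIM_APP, "iphone-like"))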

Discussion

As the authors stated, this isn't the first time something like this has been tried. For example, National Instruments has the LabVIEW platform, which admittedly provides much more functionality and control than the Mobia Modeler and is targeted at a much more technical crowd. Additionally, the Java programming language comes to mind in reference to platform-independent and -specific code.

A basic code example in LabVIEW. Image courtesy of San Diego State University

That being said, I think the authors did a fantastic job of bringing these concepts together in a way that is accessible to users of all technical backgrounds. Indeed, it makes the most practical sense (to me) to implement the system in this way. Non-technical users will quickly come to learn how to control interactions between objects graphically, and the platform-independent and -specific paradigm makes this application marketable across a wide share of current mobile devices. It comes as no surprise to me that the programmers felt more at ease using this system: they already think in terms of the system. With a little more practice, however, I believe that the non-technical users would be up to speed with their more technical counterparts in no time.

Paper Reading #18: Evaluating the Design of Inclusive Interfaces by Simulation

Commentary

See what I have to say about Steven's and Evin's work.

References

Biswas, P., and Robinson, P. (2010). Evaluating the design of inclusive interfaces by simulation. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

In this paper, Biswas and Robinson discuss their development of a simulator that evaluates usage scenarios for different assistive interfaces. Assistive interfaces refer to those interfaces designed to assist users who are physically impaired.

The Samsung Jitterbug, an example (sort of) of an assistive interface. Image courtesy of My Vision Aid, Inc.

Their study consisted of comparing the simulator's predictions of how long different tasks would take for users with various impairments against actual measured times for different users. The researchers identify text-search tasks and icon-search tasks, but specifically focus on icon-search tasks. The two subtasks tested were searching for an icon, and pointing and clicking on an icon. They varied the spacing between icons and font size for the icon captions. The participants in their test consisted of able-bodied individuals, individuals with vision impairments, and individuals with motor impairments. In computing the error in the simulator's prediction of how long the task would take, they found that the simulator accurately predicted task times with statistical significance.
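For a sense of what such a prediction might look like, here's a back-of-the-envelope Python sketch: a visual search term plus a Fitts's-law-style pointing term, each scaled by an impairment factor. The constants, the scaling factors, and even the use of Fitts's law are my own assumptions; the actual simulator is, of course, far more sophisticated.

# Toy prediction of icon-search task time: linear visual search plus a
# Fitts's-law-like pointing time, each inflated by an impairment factor.
import math

def predicted_task_time(n_icons: int, icon_size_mm: float, distance_mm: float,
                        vision_factor: float = 1.0, motor_factor: float = 1.0) -> float:
    """Rough seconds to find and click one icon among n_icons."""
    search = 0.1 * n_icons * vision_factor                  # visual search term
    fitts_id = math.log2(distance_mm / icon_size_mm + 1)    # index of difficulty
    pointing = (0.1 + 0.15 * fitts_id) * motor_factor       # a + b * ID
    return search + pointing

print(f"able-bodied:     {predicted_task_time(20, 10, 150):.2f} s")
print(f"vision-impaired: {predicted_task_time(20, 10, 150, vision_factor=2.0):.2f} s")
print(f"motor-impaired:  {predicted_task_time(20, 10, 150, motor_factor=3.0):.2f} s")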

Discussion

I am sure that there have been other systems similar to this that have been developed, but this is the first I have heard about such a system. This seems to be a natural extension of unit testing, where in this case the unit is the usability rather than the functionality of the interface itself. The obvious gain here is that, given the efficacy of this model in predicting performance, one need not waste the time and money to perform an actual user study on an interface: just input the impairment parameters and run the interface through the simulator. Of course, the interface would have to be codified as per the simulator's capabilities, i.e. one would need to know font size and distance between icons, but this seems like something that is a pretty important part of inclusive interface design anyway. What I'm saying is, it doesn't seem like it would be a big stretch to be able to set up an interface to be tested by this simulator. Here's an idea for future work: extend the simulator from processing an interface based on a flat screen to processing three-dimensional data. Simulate movement throughout an environment, say, a home? I'm a fan of optimization.