26 April 2011

Paper Reading #20: iSlideshow: a Content-Aware Slideshow System

Commentary

See what I have to say about ___'s and ___'s work.

References

Chen, J., Xiao, J., and Gao, Y. (2010). iSlideshow: a content-aware slideshow system. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

This paper details a slideshow system that groups photos and generates transitions based on their content rather than on arbitrarily defined effects. Content-based grouping allows the system to create one larger image from multiple smaller images that are seamlessly tiled together. The researchers implement a comparison algorithm that ensures a good flow of color and content from the edge of one image to the next, building the entire scene iteratively:

Image courtesy of the above-cited article.
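Just to make the grouping idea concrete for myself, here's a rough Python sketch (definitely not the authors' algorithm) of scoring edge-to-edge color flow and greedily ordering images; the mean-color metric and function names are my own simplifications.

```python
import numpy as np

def edge_strip(img, side, width=8):
    """Return the left or right vertical strip of an RGB image array (H x W x 3)."""
    return img[:, :width, :] if side == "left" else img[:, -width:, :]

def edge_compatibility(img_a, img_b, width=8):
    """Rough compatibility score: negative mean color difference between
    the right edge of img_a and the left edge of img_b (higher is better)."""
    right = edge_strip(img_a, "right", width).reshape(-1, 3).mean(axis=0)
    left = edge_strip(img_b, "left", width).reshape(-1, 3).mean(axis=0)
    return -float(np.linalg.norm(right - left))

def greedy_tile_order(images):
    """Iteratively build a left-to-right tiling, always appending the image
    whose left edge best matches the current rightmost edge."""
    remaining = list(range(len(images)))
    order = [remaining.pop(0)]
    while remaining:
        best = max(remaining, key=lambda i: edge_compatibility(images[order[-1]], images[i]))
        remaining.remove(best)
        order.append(best)
    return order

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    photos = [rng.integers(0, 256, size=(120, 160, 3)).astype(float) for _ in range(5)]
    print(greedy_tile_order(photos))
```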

The researchers also use facial recognition to generate transitions that are relevant to the scene. Once photos are grouped by similar content, the transitions take into account the positions of faces across the photos in a group and are built around those locations.

The results of a user study conducted by the researchers show that participants enjoyed using their system more than comparable systems, rating it higher on aesthetic appeal and fun. Several users mentioned that the slideshows seemed more meaningful with the content-aware transitions.

Discussion

I feel like this is a pretty innovative use of a content-aware system for manipulating photos and such. The fact that it was so well received by the users included in the study leads me to believe that this type of system would be very successful in a mainstream market. I personally would like to try it out! I feel like the current success of arbitrary transition effects and vanilla slideshow presentations is due largely to the fact that they are just so easy to implement, and any of the more impressive effects, like the ones detailed in this article, are viewed as being too difficult for anything but a Photoshop poweruser or something similar. I'm glad to see the steps that these researchers took with respect to interface design and making the creation of aesthetically pleasing presentations fun and exciting.

Paper Reading #15: TurKit: Human Computation Algorithms on Mechanical Turk

Commentary

See what I have to say about ___'s and ___'s work.

References

Little, G., et al. (2010). TurKit: human computation algorithms on Mechanical Turk. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

In this paper, the researchers describe TurKit, a scripting environment for Amazon's Mechanical Turk (MTurk). MTurk allows people to post Human Intelligence Tasks (HITs) as jobs, for which MTurk Workers (Turkers) may be paid a few cents or more, depending on the difficulty of the task and the amount of time taken to complete it. Some tasks include adding tags or descriptions to images, organizing images based on their content, or voting on the most descriptive or grammatically correct passage of writing in a set. MTurk provides an API that allows users to interface directly with the job creation system. The TurKit environment helps requesters automate the posting of jobs and manage iterative jobs as well.

Image courtesy of the above-cited article.

TurKit utilizes the crash-and-rerun programming paradigm, in which a script runs until it crashes or is terminated and then restarts from the beginning, with completed steps recorded so they are not re-executed. Because posting jobs costs money, the key advantage of this approach is that posters can be sure they are not reposting tasks that have already been completed, thereby saving money. TurKit also provides for the automation of parallelized tasks, a key advantage of MTurk.
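TurKit itself is a JavaScript environment, so purely as an illustration of the crash-and-rerun idea, here's a little Python sketch of my own in which completed (paid) steps are memoized to disk so a restarted script replays them instead of re-running them; none of these names come from TurKit.

```python
import json
import os

STATE_FILE = "crash_and_rerun_state.json"

def _load_state():
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)
    return {}

def once(key, expensive_fn):
    """Run expensive_fn only if this step has never completed before.
    On a rerun after a crash, previously recorded results are replayed
    from disk instead of being re-executed (and re-paid for)."""
    state = _load_state()
    if key in state:
        return state[key]          # replay the memoized result
    result = expensive_fn()        # e.g. post a HIT and wait for an answer
    state[key] = result
    with open(STATE_FILE, "w") as f:
        json.dump(state, f)
    return result

def fake_post_hit(question):
    # Stand-in for posting a paid HIT to MTurk; costs money in the real system.
    print(f"posting HIT: {question}")
    return f"worker answer to: {question}"

if __name__ == "__main__":
    # Re-running this script replays step 1 and step 2 instead of reposting them.
    a = once("step-1", lambda: fake_post_hit("Describe this image"))
    b = once("step-2", lambda: fake_post_hit(f"Improve: {a}"))
    print(b)
```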

Discussion

I'm such a huge fan of automation. Parallelism is an added plus, which I'm hoping is something that will start to catch on here in the not-too-distant future. I also like when system maintainers publish an API so that users with that DIY ethic can tinker around and build something that is incredibly useful, if not to the general population then at least to themselves. In short, I like everything I've heard surrounding MTurk and TurKit.

I think that the thing I'm most excited about with respect to this article is the practicality of TurKit. It takes advantage of all of the best parts about MTurk and makes them more accessible. It's a simple design that is very well executed. Nice work :)

Paper Reading #14: A Framework for Robust and Flexible Handling of Inputs with Uncertainty

Commentary

See what I have to say about ___'s and ___'s work.

References

Schwarz, J., et al. (2010). A framework for robust and flexible handling of inputs with uncertainty. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

This paper details a system that handles uncertain or ambiguous input. The researchers have devised a system that extends the conventional input system. Whereas in a conventional system, a user action either causes or does not cause a system action, in the uncertain system, all possible actions are taken into account and the most probable action is chosen. Actions do not cause final, irreversible changes to the system until temporary feedback is given to ascertain the intended input or the inferred action crosses a certain probability threshold. In the event that an action does not cross the "most probable" threshold, the user is given feedback of certain types when performing an action and may alter the action to generate the desired response. One possible temporary feedback type is detailed in the images below. As the user tries to select one slider, both are accidentally activated. A conventional system would just select one slider regardless of user intention, or do nothing at all. The uncertain system selects both sliders, gives temporary feedback on the possible state, and then allows the user to correct their input before a finalized action is taken.

Image courtesy of the above-cited article.
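As a loose sketch of the general idea (and not the framework's actual API), the snippet below scores each candidate widget by how likely a touch was meant for it, commits only past a confidence threshold, and otherwise flags the interaction as tentative so temporary feedback can be shown; the Gaussian likelihood and the threshold value are my own assumptions.

```python
import math
from dataclasses import dataclass

@dataclass
class Widget:
    name: str
    center_x: float  # 1-D layout for simplicity

def touch_likelihood(widget, touch_x, sigma=12.0):
    """Gaussian likelihood that a touch at touch_x was aimed at the widget."""
    return math.exp(-((touch_x - widget.center_x) ** 2) / (2 * sigma ** 2))

def interpret_touch(widgets, touch_x, commit_threshold=0.75):
    """Return (decision, per-widget probabilities). If no single widget is
    probable enough, the decision is 'tentative' and the UI would show
    temporary feedback on all plausible targets instead of acting."""
    scores = {w.name: touch_likelihood(w, touch_x) for w in widgets}
    total = sum(scores.values()) or 1.0
    probs = {name: s / total for name, s in scores.items()}
    best_name, best_p = max(probs.items(), key=lambda kv: kv[1])
    if best_p >= commit_threshold:
        return ("commit", best_name), probs
    return ("tentative", [n for n, p in probs.items() if p > 0.2]), probs

if __name__ == "__main__":
    sliders = [Widget("volume", 100.0), Widget("brightness", 112.0)]
    decision, probs = interpret_touch(sliders, touch_x=106.0)
    print(decision, probs)  # likely tentative: both sliders get temporary feedback
```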

This system can handle uncertainty with both graphical and text input, including activation of multiple tiny buttons; inexactly placed input for scrolling, resizing windows, and dragging and dropping icons; multiple interpretations for spoken input to text translation; and greater ease of use for people with motor impairments.

Discussion

As with most projects that are at least latently philanthropic in nature, I really enjoyed this paper. For starters, gently correcting for erroneous input seems like a great idea, and it seems that this system does this without generating a large amount of overhead. We are already used to word processors automatically correcting our commonly misspelled words, or Google showing us results for what they think we really meant to search. Second, their results show a high rate of success in increasing ease-of-use for motor-impaired individuals, which is awesome. Admittedly, automatic "corrections" or "suggestions" can sometimes be pretty annoying; from what I've read, this system seems to strike a good balance.

Paper Reading #13: Gestalt: Integrated Support for Implementation and Analysis in Machine Learning

Commentary

See what I have to say about ___'s and ___'s work.

References

Patel, K., et al. (2010). Gestalt: integrated support for implementation and analysis in machine learning. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

This group of researchers presents Gestalt, an environment that integrates support for implementing and analyzing machine learning systems. They demonstrate it with a gesture recognition application, using the environment to track down bugs in the source code and cases where the system recognizes or fails to recognize a gesture.

Discussion

In all honesty, this paper was completely incomprehensible to me.

Image courtesy of Queen Michelle

Paper Reading #12: Pen + Touch = New Tools

Commentary

See what I have to say about ___'s and ___'s work.

References

Hinckley, K., et al. (2010). Pen + touch = new tools. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

In this paper, the researchers examine the capabilities of a multimodal interface with touch- and pen-based interaction. They believe that interactions can be separated into unimodal and multimodal categories based both on input device and intended action.

In the first phase of their project, the researchers conducted an initial study where subjects used pen, paper, and various tools to organize objects in a notebook. They took note of common actions that all subjects performed, and what affordances a fully manipulatable environment granted them. Some examples that helped the researchers categorize different unimodal and multimodal commands include the subject tucking the pen in the fingers to manipulate objects in the environment, using only fingers to hold down or reposition objects, or using objects as part of the environment with one hand while drawing with the other.

Using their observations from the first phase, they designed an interface on the Microsoft Surface system that incorporates as many natural affordances from the first phase as possible into the second: a multimodal pen-and-touch interface. Touch and pen input can be unimodal and have their own affordances in these cases; when the input is combined, i.e. multimodal, the context of the interactions changes and a new set of affordances becomes available. They categorize this difference as follows: "...the pen writes, and touch manipulates, period." Some of the combined, i.e. multimodal, interactions included: holding objects together and tapping with the pen to "staple" them; holding an object steady and using the pen as an X-acto knife; holding an object steady and creating a "carbon copy" by dragging a copy off of it with the pen; and holding an object steady and using it as a straightedge along which to draw with the pen.

Using an object in the scene as a straightedge.
Image courtesy of the above-cited paper.
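A toy dispatcher of my own invention can illustrate the "pen writes, touch manipulates" split and how holding an object with touch changes what a pen stroke means; none of this is the authors' code.

```python
def interpret(pen_action, touch_held_object=None):
    """Map a pen action, in the context of what touch is currently holding,
    to a tool. Pen alone writes; touch alone manipulates; pen + touch
    together unlock the combined tools described in the paper."""
    if touch_held_object is None:
        return "ink stroke"                      # the pen writes
    combined_tools = {
        "tap": f"staple items under {touch_held_object}",
        "drag_across": f"cut {touch_held_object} (X-acto knife)",
        "drag_off": f"carbon copy of {touch_held_object}",
        "stroke_along_edge": f"draw along {touch_held_object} as a straightedge",
    }
    return combined_tools.get(pen_action, "ink stroke")

if __name__ == "__main__":
    print(interpret("tap"))                                  # ink stroke
    print(interpret("tap", touch_held_object="photo pile"))  # staple
    print(interpret("stroke_along_edge", touch_held_object="ruler object"))
```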

Discussion

This interface is even more intuitive than the last one I reviewed! And I love it! I appreciate the work that the researchers put into observing natural interactions with the type of environment that they wanted to create. This seems to be the smartest way to make an interface parallel interaction in the real world, and indeed, to allow an interface to achieve its maximum usability potential. The way they separated out the roles of touch and pen was ingenious. All in all, this is one of the best designs for a new interface I have seen throughout these papers. I'm almost as excited about this interface as I am about the Minority Report-style interface :)

Paper Reading #11: Hands-On Math: A page-based multi-touch and pen desktop for technical work and problem solving

Commentary

See what I have to say about ___'s and ___'s work.

References

Zeleznik, R., et al. (2010). Hands-on math: a page-based multi-touch and pen desktop for technical work and problem solving. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

The researchers for this paper present a fusing of two technologies that seem to complement each other well: pencil-and-paper mathematical calculations and a Computer Algebra System (CAS). They argue that pencil-and-paper provides a greater degree of freedom with respect to spatial interaction and is more intuitive because of the physical method of input. They do not discount the usefulness of a CAS in solving complex equations much more quickly and efficiently than could be done by hand, however. They have created a system, built on the Microsoft Surface, that combines these two interactions and interfaces, along with multitouch capabilities and multimodal input, i.e. light pen and fingers.

The researchers detail the capabilities of their system in depth. The system affords the user the ability to create and delete pages, pan across the virtual tabletop, access context-specific menus, and influence the context of gestures depending on the combination of pen and touch inputs utilized. The researchers also integrated preexisting Software Development Kits (SDKs) to carry out the mathematical operations. Overall, the results from a user study they conducted support their hypothesis, but offer many suggestions for future expansion as well. Specifically, users detailed what functionality they would have liked to see added to pages, e.g. growing or shrinking pages, or having data on one page accessible to all pages; and issues regarding the necessity of having both hands free for gesture input.
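To show how recognized pen input might get handed off to a CAS, here's a tiny sketch using the SymPy library; the idea of feeding a recognized expression string into an existing solver mirrors the paper's use of preexisting SDKs, but the specific code is my own assumption.

```python
import sympy as sp

def solve_recognized_expression(expr_text, variable="x"):
    """Pretend expr_text came from handwriting recognition of a pen stroke,
    then let the CAS do the heavy lifting."""
    x = sp.symbols(variable)
    expr = sp.sympify(expr_text)        # parse the recognized text
    return sp.solve(sp.Eq(expr, 0), x)  # solve expr = 0 symbolically

if __name__ == "__main__":
    # e.g. the user wrote "x**2 - 5*x + 6" on a page and circled it
    print(solve_recognized_expression("x**2 - 5*x + 6"))  # [2, 3]
```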

Discussion

I really like this design concept. I feel the exact same way as the researchers do about the affordances of pencil-and-paper and CAS. Personally, I can't wait until this happens:

Image courtesy of Niobium Labs

There is something that is just so inherently intuitive about touch interfaces: every interaction between ourselves and our physical world is through some form of manual manipulation. I'm glad that these types of interfaces are becoming more mainstream. That is all.

Paper Reading #10: PhoneTouch: A Technique for Direct Phone Interaction on Surfaces

Commentary

See what I have to say about ___'s and ___'s work.

References

Schmidt, D., et al. (2010). PhoneTouch: a technique for direct phone interaction on surfaces. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

This group of researchers presented a novel interaction scheme utilizing a mobile phone as a stylus in conjunction with a finger, with interaction-dependent responses based on whether a phone or a finger was used. The interface was a tabletop touchscreen that connected to one or several phones via Bluetooth and discriminated between phone and finger touches via impact size and accelerometer data sent by a device attached to the phone. The image below depicts this setup:

Image courtesy of the above-cited article.
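Here's a minimal sketch of the matching idea as I understand it, under my own assumptions rather than the authors' implementation: a surface contact counts as a phone touch if some phone reported an accelerometer spike within a short time window of the contact.

```python
def classify_contact(contact_time, contact_size, phone_bumps,
                     max_skew=0.05, finger_size_max=1.2):
    """Classify a tabletop contact as 'phone' or 'finger'.

    contact_time -- timestamp of the surface contact (seconds)
    contact_size -- blob size reported by the surface (arbitrary units)
    phone_bumps  -- {phone_id: [timestamps of accelerometer spikes]}
    A contact counts as a phone touch if some phone felt a bump within
    max_skew seconds; small blobs with no matching bump are fingers.
    """
    for phone_id, bumps in phone_bumps.items():
        if any(abs(contact_time - t) <= max_skew for t in bumps):
            return ("phone", phone_id)
    if contact_size <= finger_size_max:
        return ("finger", None)
    return ("unknown", None)

if __name__ == "__main__":
    bumps = {"alice-phone": [10.012], "bob-phone": [14.300]}
    print(classify_contact(10.03, contact_size=2.0, phone_bumps=bumps))  # phone
    print(classify_contact(12.50, contact_size=0.8, phone_bumps=bumps))  # finger
```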

The idea behind this design is that interaction with the interface changes context when a phone is used instead of a finger. For example, a user can open up their photo album, touch the interface with their phone, and have the photos displayed on the interface. They can then move images around, examine them, separate them into groups, and transfer those images to another phone by tapping on a group with that phone. In general, the results coincide with what the researchers were hoping to achieve.

Discussion

This is an interesting concept, and I'd like to see it implemented in the real world, but I believe that will only happen once a device like this interface becomes more multipurpose, not to mention mainstream. There are already table-type touch interfaces, but they are not very widespread. The benefit of personal computers is that they are prolific and multipurpose, i.e. one can do more than just interact with photos and files on a phone. The cost of this type of system would make it prohibitive to own for anyone except possibly a gadget junkie until the above issues are addressed. Really neat concept though :)

Paper Reading #8: Communicating Software Agreement Content Using Narrative Pictograms

Commentary

See what I have to say about ___'s and ___'s work.

References

Kay, M., and Terry, M. (2010). Communicating software agreement content using narrative pictograms. Proceedings of the ACM Conference on Human Factors in Computing Systems (pp. 2705-2714). Atlanta: http://www.sigchi.org/chi2010/.

Article Summary

These researchers set out to address several common issues with software agreements, such as End User License Agreements, and their reception by a user base. A few key issues that they cite are the amount of text and its relative reading level, the fact that some agreements are localized for only a few languages or regions, and the success of pictorial communication of ideas contrasted with its absence from software agreements.

The researchers set up a system of pictograms depicting how content and usage information would be collected from an image editing program. They utilized four sets of diagrams, each depicting a different type of data that would be collected and how it would be used. Below is one such diagram with the explanatory text removed, as one of the subjects of the study might have seen and had to interpret it:

Image courtesy of the above-cited article.

In general, the results were promising: their initial test group had some difficulty identifying the concepts outlined by the pictograms, but with some slight modifications, the rest of the subjects did sufficiently well. Having basic text explaining the images did much to increase understanding of the images, and was much easier to read than the software agreement itself.

Discussion

I enjoyed the concept of pictorial descriptions of license agreements, but I doubt that the licenses will ever go away entirely. They are obviously necessary for legal reasons, and in some instances a user will trudge through an agreement out of boredom or curiosity or out of a genuine desire to know his limits and freedoms. A main issue that the researchers will have to overcome is comprehension of the pictures. The licenses themselves, while sometimes inherently incomprehensible due to the high level of "legalese," still provide an accurate and complete description of the responsibilities of both the user and the company providing the software. The pictures, while only meant to "augment the text," as the researchers put it, will still have to provide an accurate description of the text. I can only imagine the problems that would arise, what with this country's affinity for frivolous lawsuits and the like:

Image courtesy of Natalie Dee.

Paper Reading #7: Hard-To-Use Interfaces Considered Beneficial (Some of the Time)

Commentary

See what I have to say about ___'s and ___'s work.

References

Riche, Y., et al. (2010). Hard-to-use interfaces considered beneficial (some of the time). Proceedings of the ACM Conference on Human Factors in Computing Systems (pp. 2705-2714). Atlanta: http://www.sigchi.org/chi2010/.

Article Summary

This group of researchers took an entirely novel approach to evaluating their systems for usability. They examine something that would initially have been considered a bug or a barrier to usability from their perspective and instead treat it as a feature.

Image courtesy of The Geek Whisperer.

In their first study, the researchers placed a group of computer scientists in a collaborative design environment, where the group discovered a bug that linked each person's interaction with the interface to everyone else's actions. The solution that the group implemented themselves was to increase social interaction outside of the system, e.g. waiting for someone else to finish a task before beginning their own, or asking the group to pause so that they could complete their task. The result of this collaboration was that the group that had to deal with the "bug" reported higher satisfaction with the system, because interacting allowed them to make fewer errors related to each other's actions.

In the second study, the researchers questioned a group of older individuals to gauge their satisfaction with new interfaces that are designed to make interacting with technology easier. The results of this examination, however, showed that the group did not value correspondence generated with the help of technology as much as they valued correspondence generated by hand. For this group, apparent effort equated to higher value. The researchers' proposed solution to this problem was to make an interface deliberately harder to use, thereby increasing the perceived value of messages generated with it.

Discussion

I was really excited to see the evaluation of this specific issue, a "bug" becoming a "feature" instead of being immediately fixed. It obviously worked out well in these cases, and speaks volumes about our ability as a species to overcome adversity. I know that sounds really cheesy, but it's true. Our society is all about instant gratification: I want this and I want it now! Things like this force us to take a step back and actually embrace our humanity rather than push it to the wayside. For some definition of the word "humanity," that is... Now this isn't to say that I don't appreciate the fast response of Google or the wide knowledge base that something like Netflix or Pandora utilizes to generate content and so on. I like those services for what they are: tools. Knowing that they exist, would I be disappointed if I couldn't use them anymore? Yeah, I would. Would I mourn their loss like the loss of a friend? Absolutely not. I don't know if there's anyone out there who would freely admit that they would, but take a look at how "constantly connected" we are to our technology. If you've ever done it, you know how nice it is to unplug every once in a while.

Image courtesy of Trip Advisor.

25 April 2011

Paper Reading #24: A Natural Language Interface of Thorough Coverage by Concordance with Knowledge Bases

Commentary

See what I have to say about ___'s and ___'s work.

References

Han, Y., et al. (2010). A natural language interface of thorough coverage by concordance with knowledge bases. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

In this paper, Han et al. discuss a novel approach to solving a common problem with natural language interface (NLI) systems. As illustrated below, of the total set of expressions a user might possibly input, there is a partial disparity between those expressions that a given system can interpret and those which a given knowledge base can answer.

Image courtesy of the above-cited article.

Whereas most NLI systems try to make up the difference by expanding the number of expressions interpretable by the system, this team proposed to generate the interpretable expressions from the expressions answerable by the knowledge base instead. They identify several levels of classification based on a graph representation of the knowledge base, and generate all queries for a given level that are answerable. They then match each user expression to one of those evaluable expressions via a similarity measure.
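To make that concrete for myself, here's a toy Python sketch (all simplifications mine) that enumerates the questions a tiny knowledge base can answer and then casts a user expression as the most similar one by token overlap.

```python
def generate_answerable_queries(kb_triples):
    """Enumerate simple one-hop questions the knowledge base can answer,
    one query template per (subject, relation) pair."""
    queries = {}
    for subject, relation, obj in kb_triples:
        question = f"what is the {relation} of {subject}"
        queries[question] = obj
    return queries

def jaccard(a, b):
    """Token-overlap similarity between two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def answer(user_expression, queries):
    """Cast the user's expression as the most similar answerable query."""
    best_q = max(queries, key=lambda q: jaccard(user_expression, q))
    return best_q, queries[best_q]

if __name__ == "__main__":
    kb = [("France", "capital", "Paris"), ("Nile", "length", "6650 km")]
    generated = generate_answerable_queries(kb)
    print(answer("capital of France?", generated))
```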

Discussion

I don't know how any NLI systems work, but honestly, this approach, while (purportedly) novel, does not seem like a great feat of science and research. Maybe it is, I don't know: like I said, I know nothing about NLI systems. But I feel like I might possibly have stumbled upon this concept if you gave me the image above and described the problem to me. I'm just saying. That being said, was 2010 really the first time that anyone had thought to cast the problem like this, or was that just the first time anyone had thought to publish a paper on the topic? It just really seems like this specific problem is one that computer science as an all-encompassing entity would have solved a long time ago. Please, rebuke me, correct me, enlighten me if you can.

19 April 2011

Paper Reading #25: Using Language Complexity to Measure Cognitive Load for Adaptive Interaction Design

Commentary

See what I have to say about ___'s and ___'s work.

References

Khawaja, M. A., Chen, F., and Marcus, N. (2010). Using language complexity to measure cognitive load for adaptive interaction design. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

The researchers addressed the problem of adaptive interfaces in this paper, specifically adaptive interaction design. They proposed a method of controlling the level of adaptation of an interface via the monitoring of speech patterns. Different patterns were mapped to different levels of cognitive load, or how much of the brain's processing power is being used to compute a task. Current speech recognition capabilities make real-time implementation infeasible; instead, the researchers performed quantitative analysis on transcriptions of several training exercises at an Australian bushfire management facility. They measured semantic difficulty, reflected in word usage, and syntactic complexity, reflected in sentence length. Their hypothesis proved correct save for one prediction. Using statistical analysis, they were able to apply different language complexity measures to accurately map sentences used in different situations to the appropriate level of cognitive load for that situation.

For lack of a more salient picture, here is a table of their results.

As speech recognition capabilities become more advanced, they hope to test their implementation in a real-time situation.
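For my own benefit, here's a crude Python sketch of the kind of measures involved; words-per-sentence as a proxy for syntactic complexity, and word length and vocabulary variety as proxies for semantic difficulty, are my stand-ins rather than the authors' actual metrics.

```python
import re

def complexity_measures(utterance):
    """Crude proxies: syntactic complexity as words per sentence,
    semantic difficulty as mean word length and vocabulary variety."""
    sentences = [s for s in re.split(r"[.!?]+", utterance) if s.strip()]
    words = re.findall(r"[A-Za-z']+", utterance.lower())
    if not words or not sentences:
        return {"words_per_sentence": 0.0, "mean_word_length": 0.0, "type_token_ratio": 0.0}
    return {
        "words_per_sentence": len(words) / len(sentences),
        "mean_word_length": sum(len(w) for w in words) / len(words),
        "type_token_ratio": len(set(words)) / len(words),
    }

if __name__ == "__main__":
    low_load = "Send the truck. Go now."
    high_load = ("Reallocate the northern tankers to the secondary containment line "
                 "because the prevailing wind is pushing the fire toward the ridge.")
    print(complexity_measures(low_load))
    print(complexity_measures(high_load))
```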

Discussion

I really enjoyed this paper. Sometimes, I think I should have gone into English, specifically linguistics. Don't get me wrong: I like to chill out a little bit in this here discussion section, but I feel that crafting a literary piece or a technical document is akin to creating a work of art. Languages are so easy to use in a practical sense, but they can be wonderfully complex if they are truly understood. These researchers understand language (or at least how to analyze it). What's more is that they understand the importance of language and communication. Once speech recognition is up to par with the vision that these researchers have laid out, our interactions with our machines will be as seamless as our interactions with each other. Technology is a tool, and nothing more, but capable, efficient tools only make sense; why would we not strive to reach the limits of our potential to create and design?

07 April 2011

Paper Reading #9: Imaginary Interfaces: Spatial Interaction with Empty Hands and without Visual Feedback

Commentary

See what I have to say about ___'s and ___'s work.

References

Gustafson, S., Bierwirth, D., and Baudisch, P. (2010). Imaginary interfaces: spatial interaction with empty hands and without visual feedback. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

This article describes a system called Imaginary Interfaces, which focuses on screen-less wearable devices; it presents the setup and results of a user study conducted by the researchers, and details a mock-up of such a device. The idea behind this interface is to have the user create a "screen context," in which they operate as if they were using a device providing spatial feedback:

Image courtesy of the above referenced article.

All of the spatial information is contained in the user's mind, however; all interaction is done relative to the user's frame of reference. Three user studies were conducted: the user drew common shapes and letters multiple times, and the variation between consecutive drawings was analyzed; the user drew an image and then was required to identify a specific point on that image, both with and without changing their frame of reference, i.e. standing still or turning, respectively; and the user identified a point in a coordinate system, the units of which were in lengths of digits, i.e. (2,1) referred to two thumbs right and one finger up. The results were very interesting, casting this type of device as something that could be practically realized. Finally, the researchers proposed a possible design for the device, which works by illuminating the user's hands with infrared light, applying a luminance threshold, and discerning the structures that comprise the imaginary interface:

Image courtesy of the above referenced article.
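As a minimal sketch of the luminance-threshold step only (not the actual prototype), this NumPy snippet thresholds an infrared intensity frame to isolate bright hand pixels and returns their bounding region; the threshold value is my assumption.

```python
import numpy as np

def segment_hands(ir_frame, threshold=200):
    """Apply a luminance threshold to an infrared intensity frame (2-D array,
    0-255) and return a binary hand mask plus its bounding box, if any."""
    mask = ir_frame >= threshold
    if not mask.any():
        return mask, None
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    bbox = (int(rows[0]), int(rows[-1]), int(cols[0]), int(cols[-1]))
    return mask, bbox

if __name__ == "__main__":
    frame = np.zeros((240, 320), dtype=np.uint8)
    frame[100:140, 150:200] = 230   # pretend this bright patch is an IR-lit hand
    mask, bbox = segment_hands(frame)
    print(bbox)  # (100, 139, 150, 199)
```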

Discussion

First of all, to say nothing of the subject of the paper itself, this was the most straightforward research I have yet encountered. The concept of the device was clearly laid out, motivation and previous work was well-presented, and the results of the user studies were very accessible. That being said, I want one. I am fascinated by the concept of wearable computing, and this just takes it one step further. I honestly believe that so many awkward interactions could be alleviated through the use of something like this, e.g. you're on the phone and you just cannot express some simple concept or idea, and if only you were able to just draw a simple diagram, or had a little whiteboard... I really like this concept. I think they did a great job at coming up with tangible results from creating a frame of reference to come back to without having to be in the same visual context, i.e. you can pull up your frame of reference anywhere you want by just popping up your thumb and forefinger. Great work :)

05 April 2011

Paper Reading #19: From Documents to Tasks: Deriving User Tasks from Document Usage Patterns

Commentary

See what I have to say about Shena's and Derek's work.

References

Brdiczka, O. (2010). From documents to tasks: deriving user tasks from document usage patterns. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

In this paper, the researcher presents a novel approach to implementing a task management support system. A system such as this takes care of monitoring the tasks of knowledge workers and clustering common tasks to increase productivity. The researcher argues that switching between multiple tasks is "expensive because each task requires some recovery time as well as the reconstitution of task context." This system is novel because it does not group documents based on title or content, both of which introduce privacy concerns. Instead, documents are assigned a unique identifier and are filtered by their dwell time, or how long they have focus and are actively being accessed. Documents are then grouped by similarity via a spectral clustering algorithm. The proposed system had the additional benefit of not needing any user input whatsoever, something that most other systems of its type require. The system was evaluated over a period of a month, with observation days being non-contiguous. Normal knowledge workers were observed performing some commonly recurring tasks. The proposed system showed a high level of effectiveness performing task grouping in comparison with similar systems.
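Here's a small sketch of the clustering step, assuming a precomputed co-access similarity matrix over anonymized document IDs; it uses scikit-learn's SpectralClustering and is not the author's exact pipeline.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_documents(similarity, n_tasks):
    """Group anonymized document IDs into task clusters from a pairwise
    similarity matrix (e.g. how often two documents are in focus close
    together in time), using spectral clustering."""
    model = SpectralClustering(n_clusters=n_tasks, affinity="precomputed",
                               random_state=0)
    return model.fit_predict(similarity)

if __name__ == "__main__":
    # Toy co-access similarity for 6 documents: two obvious task groups.
    S = np.array([
        [1.0, 0.9, 0.8, 0.1, 0.0, 0.1],
        [0.9, 1.0, 0.7, 0.0, 0.1, 0.0],
        [0.8, 0.7, 1.0, 0.1, 0.0, 0.1],
        [0.1, 0.0, 0.1, 1.0, 0.9, 0.8],
        [0.0, 0.1, 0.0, 0.9, 1.0, 0.7],
        [0.1, 0.0, 0.1, 0.8, 0.7, 1.0],
    ])
    print(cluster_documents(S, n_tasks=2))  # e.g. [0 0 0 1 1 1]
```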

Discussion

This paper seemed incredibly abstract to me. I'm sure that at least some of the industry-specific terms like "knowledge worker" and "recovery time" and "reconstitution of task context" must have meaning to someone out there, right? I just don't understand what the point of "task clustering" is supposed to be; what does it do? How does it increase productivity? To a lowly, uninitiated "luser" like me (NO I WILL NOT APOLOGIZE FOR THAT DON NORMAN), it feels like we're "promoting synergy" or some other dumb catch phrase:

Image courtesy of The Lonely Island and Oh! Ryan Kelley

I'm not knocking the paper, the author, or his work; he seems to have done a pretty stellar job. And I appreciate the novelty of his approach, not to mention the apparent success of his method. I just don't get it. At all.

02 April 2011

Paper Reading #16: Mixture Model based Label Association Techniques for Web Accessibility

Commentary

See what I have to say about Wesley's and Miguel's work.

References

Islam, M. A., Borodin, Y., and Ramakrishnan, I. V. (2010). Mixture model based label association techniques for web accessibility. Proceedings of the ACM Conference on User Interface Software and Technology. New York: http://www.acm.org/uist/uist2010/.

Article Summary

Islam et al. present a system they have created that extends the functionality of an assistive technology called screen reading. This technology utilizes text-to-speech to allow blind users to navigate websites by reading the content on webpages and descriptions of elements to them. A major impediment to the proper functioning of screen readers is the omission of labels for page elements and alternative text for images. Without proper labels, form elements can be misrepresented or not denoted at all. Without alternative text for images, transaction functionality for most websites is completely lost, as transactional dialogs are usually controlled through images, e.g. an "Add to cart" or "Checkout now" button:

Taken from the above referenced paper.

Even properly labeled items are sometimes mishandled by the screen reader because of ambiguity in the HTML Document Object Model (DOM), e.g. labels for elements and the elements themselves being contained in different HTML table rows:

Taken from the above referenced paper.

The authors implemented a finite mixture model (FMM) to create contexts to which HTML elements and possible labels belong. Using these contexts, the FMM can also create labels for unlabeled elements with some accuracy and more correctly interpret labels for ambiguous objects. In evaluating their system, the authors observed that the FMM correctly applied labels to their elements 76% of the time without prior training, and 95% of the time with prior training, when all elements were explicitly labeled. On a testing set without any labels, the FMM achieved an 81% success rate. In a user study with two blind users who were proficient with screen reading technology, both agreed that the FMM made interacting with webpages easier for them.
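I can't reproduce their finite mixture model here, but as a much simpler stand-in for the same job, this sketch scores candidate labels for a form field by geometric features (distance and vertical alignment) and picks the best one; the features and weights are arbitrary choices of mine.

```python
import math

def label_score(field_box, label_box, w_dist=1.0, w_align=0.5):
    """Score how plausibly label_box labels field_box. Boxes are
    (x, y, width, height) in page coordinates; lower score is better."""
    fx, fy = field_box[0], field_box[1] + field_box[3] / 2
    lx, ly = label_box[0] + label_box[2], label_box[1] + label_box[3] / 2
    distance = math.hypot(fx - lx, fy - ly)   # label usually sits just left of / above the field
    misalignment = abs(fy - ly)               # labels tend to share the field's vertical center
    return w_dist * distance + w_align * misalignment

def associate_labels(fields, labels):
    """Greedily assign each unlabeled field the best-scoring candidate label text."""
    return {name: min(labels, key=lambda l: label_score(box, labels[l]))
            for name, box in fields.items()}

if __name__ == "__main__":
    fields = {"field_1": (200, 100, 150, 20), "field_2": (200, 140, 150, 20)}
    labels = {"First name": (100, 100, 90, 20), "Email": (100, 140, 90, 20)}
    print(associate_labels(fields, labels))  # field_1 -> First name, field_2 -> Email
```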

Discussion

Wow. I personally feel that this is some of the most incredible research I've discovered yet. The authors have created a system that solves a very practical problem, and solves it well. The idea of creating contexts from which to infer labels was ingenious, as was the systematic approach to evaluating documents geometrically. I have to commend them: the system they created seems to be entirely robust. In addition, while this work may have implications outside of catering to users who need assistive technology, I feel that there was at least some measure of philanthropic drive behind the project, purposefully or not. I approve of this work.

Paper Reading #17: Mobia Modeler: Easing the Creation Process of Mobile Applications for Non-Technical Users

Commentary

See what I have to say about Joshua's and Shena's work.

References

Balagtas-Fernandez, F., Hussmann, H., and Tafelmayer, M. (2010). Mobia Modeler: easing the creation process of mobile applications for non-technical users. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

In this article, the researchers present a system composed of an abstraction model and its respective interpreter, called the Mobia Modeler. The aim of the system is to allow users without technical experience in mobile platform programming to create practical applications with the modeling system. A usage scenario presented by the researchers involved a doctor creating an application that would monitor a patient's vital signs and alert the doctor or emergency officials in the event of abnormal or dangerous observations. The system defines two subsets of the modeling language: a Platform Independent Model (PIM) and a Platform Specific Model (PSM). Users create applications by linking components together via the graphical user interface, and the system's processor then translates the graphical model to platform-specific code based on the relationships between the components that were defined by the user. Put another way, the user writes an application in the PIM, and the processor translates it to the PSM. Participants in the evaluation included individuals from different fields and from different technical backgrounds, i.e. those with programming experience and those without. Evaluation revealed that the programmers rated the system higher than the non-programmers.
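As a toy illustration of the PIM-to-PSM split (nothing here reflects Mobia's real metamodel), the sketch below "compiles" a tiny component model into platform-specific stubs for two invented targets.

```python
# A toy platform-independent model (PIM): components linked in sequence.
pim = [
    {"type": "sensor", "name": "heart_rate", "threshold": 120},
    {"type": "alert", "name": "notify_doctor", "channel": "sms"},
]

def to_platform_specific(pim_components, platform):
    """Translate the PIM into platform-specific pseudo-code strings (PSM).
    The two 'platforms' and their templates are invented for illustration."""
    templates = {
        "android_like": {
            "sensor": "registerSensorListener('{name}', threshold={threshold})",
            "alert": "sendNotification('{name}', via='{channel}')",
        },
        "feature_phone_like": {
            "sensor": "poll_sensor('{name}', limit={threshold})",
            "alert": "send_text('{name}', channel='{channel}')",
        },
    }
    return [templates[platform][c["type"]].format(**c) for c in pim_components]

if __name__ == "__main__":
    for line in to_platform_specific(pim, "android_like"):
        print(line)
    for line in to_platform_specific(pim, "feature_phone_like"):
        print(line)
```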

Discussion

As the authors stated, this isn't the first time something like this has been tried. For example, National Instruments has the LabVIEW platform, which admittedly provides much more functionality and control than the Mobia Modeler, and is targeted at a much more technical crowd. Additionally, the Java programming language comes to mind in reference to platform-independent and -specific code.

A basic code example in LabVIEW. Image courtesy of San Diego State University

That being said, I think the authors did a fantastic job of bringing these concepts together in a way that is accessible to users of all technical backgrounds. Indeed, it makes the most practical sense (to me) to implement the system in this way. Non-technical users will quickly come to learn how to control interactions between objects graphically, and the platform-independent and -specific paradigm makes this application marketable across a wide share of current mobile devices. It comes as no surprise to me that the programmers felt more at ease using this system: they already think in terms of the system. With a little more practice, however, I believe that the non-technical users would be up to speed with their more technical counterparts in no time.

Paper Reading #18: Evaluating the Design of Inclusive Interfaces by Simulation

Commentary

See what I have to say about Steven's and Evin's work.

References

Biswas, P., and Robinson, P. (2010). Evaluating the design of inclusive interfaces by simulation. Proceedings of the ACM Conference on Intelligent User Interfaces. Hong Kong: http://www.iuiconf.org/.

Article Summary

In this paper, Biswas and Robinson discuss their development of a simulator that evaluates usage scenarios for different assistive interfaces. Assistive interfaces refer to those interfaces designed to assist users who are physically impaired.

The Samsung Jitterbug, an example (sort of) of an assistive interface. Image courtesy of My Vision Aid, Inc.

Their study consisted of comparing the simulator's predictions of how long different tasks would take for users with various impairments against actual measured times for different users. The researchers identify text-search tasks and icon-search tasks, but specifically focus on icon-search tasks. The two subtasks tested were searching for an icon, and pointing and clicking on an icon. They varied the spacing between icons and font size for the icon captions. The participants in their test consisted of able-bodied individuals, individuals with vision impairments, and individuals with motor impairments. In computing the error in the simulator's prediction of how long the task would take, they found that the simulator accurately predicted task times with statistical significance.
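The paper's simulator is built on detailed perceptual and motor models, so purely to illustrate the flavor of such a prediction, here's a Fitts'-law-style pointing term plus a search term, with ad hoc multipliers for visual and motor impairment; every constant here is invented and none of it is the authors' model.

```python
import math

def predict_icon_task_time(distance_mm, icon_size_mm, font_size_pt,
                           visual_impairment=1.0, motor_impairment=1.0):
    """Very rough task-time prediction (seconds) for find-then-click-an-icon:
    a search term that grows as captions get smaller, plus a Fitts'-law
    pointing term. The coefficients and impairment multipliers are ad hoc."""
    search_time = visual_impairment * (1.5 + 12.0 / max(font_size_pt, 1.0))
    fitts_id = math.log2(distance_mm / icon_size_mm + 1.0)
    pointing_time = motor_impairment * (0.1 + 0.15 * fitts_id)
    return search_time + pointing_time

if __name__ == "__main__":
    print(predict_icon_task_time(200, 15, 10))                          # able-bodied baseline
    print(predict_icon_task_time(200, 15, 10, visual_impairment=2.0))   # low vision
    print(predict_icon_task_time(200, 15, 10, motor_impairment=3.0))    # motor impairment
```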

Discussion

I am sure that there have been other systems similar to this that have been developed, but this is the first I have heard about such a system. This seems to be a natural extension of unit testing, where in this case the unit is the usability rather than the functionality of the interface itself. The obvious gain here is that, given the efficacy of this model in predicting performance, one need not waste the time and money to perform an actual user study on an interface: just input the impairment parameters and run the interface through the simulator. Of course, the interface would have to be codified as per the simulator's capabilities, i.e. one would need to know font size and distance between icons, but this seems like something that is a pretty important part of inclusive interface design anyway. What I'm saying is, it doesn't seem like it would be a big stretch to be able to set up an interface to be tested by this simulator. Here's an idea for future work: extend the simulator from processing an interface based on a flat screen to processing three-dimensional data. Simulate movement throughout an environment, say, a home? I'm a fan of optimization.

24 February 2011

Paper Reading #6: Blowtooth: Pervasive Gaming in Unique and Challenging Environments

Commentary

See what I have to say about Ryan's and Chris's work.

References

Linehan, C., et al. (2010). Blowtooth: pervasive gaming in unique and challenging environments. Proceedings of the ACM Conference on Human Factors in Computing Systems (pp. 2695-2704). Atlanta: http://www.sigchi.org/chi2010/.

Article Summary

Linehan and his associates studied the concept of pervasive gaming, in which one utilizes the real world to fuel interactions in a virtual environment. Specifically, they tried to map the success of pervasive games and their applicability to various environments. They created a game called Blowtooth, a virtual drug smuggling game in which players dumped virtual contraband onto unsuspecting "mules" in international airports to get their stash through security, only to meet up with the mule once past security to retrieve the stash. An airport was chosen for the context of the game because it is a readily accessible real environment to which the concept of the game relates well. For this reason, Blowtooth is also a critical game, one which encourages the player to think critically about the ethical and societal concerns of the game and its implications.

The game works by searching for the unique ID of discoverable Bluetooth-enabled devices, storing them on the player's device, and then enacting a wait period before the player is allowed to "retrieve their stash," i.e. rediscover previously discovered devices. Non-players have no other role in the game than to possess a discoverable Bluetooth-enabled device. The players were later questioned on how appropriate they felt the environment was to the game, their level of satisfaction, increased levels of awareness of security and other passengers, and anxiety. The concept of "it's just a game" influenced the low levels of anxiety and concern for security, but the game did succeed as a critical game as reported by the players.
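Here's a rough sketch of the game loop as I read it, with a hypothetical discover_nearby_device_ids() standing in for a real Bluetooth scan; the names and the scoring are mine, not the authors'.

```python
import time

def discover_nearby_device_ids():
    """Hypothetical stand-in for a Bluetooth discovery scan; a real build
    would return the unique IDs of discoverable devices in range."""
    return {"AA:BB:CC:01", "AA:BB:CC:02"}

class BlowtoothGame:
    def __init__(self, retrieval_delay_s=600):
        self.retrieval_delay_s = retrieval_delay_s
        self.stashes = {}   # device_id -> time the contraband was planted

    def plant_contraband(self):
        """Drop virtual contraband on every discoverable 'mule' nearby."""
        now = time.time()
        for device_id in discover_nearby_device_ids():
            self.stashes.setdefault(device_id, now)

    def retrieve_contraband(self):
        """Score a retrieval for each planted mule rediscovered after the wait period."""
        now = time.time()
        nearby = discover_nearby_device_ids()
        retrieved = [d for d, planted in self.stashes.items()
                     if d in nearby and now - planted >= self.retrieval_delay_s]
        for d in retrieved:
            del self.stashes[d]
        return retrieved

if __name__ == "__main__":
    game = BlowtoothGame(retrieval_delay_s=0)  # no wait, for demonstration
    game.plant_contraband()
    print(game.retrieve_contraband())
```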

Discussion

Mwahaha! I love this game! Seriously, I want to play it myself. I feel that this game has very subversive undertones, regardless of whether or not they were addressed, but I understand that subversion isn't the point of the game. Anything that integrates a virtual world with reality so well is an immediate success in my book, especially as far as HCI is concerned. I would think that the recent focus on cloud computing will be the next big widespread pervasive technology advance, and this game really brings to light some of the implications of pervasive technologies. How easy it is already to drag-and-drop a file on your desktop into your Dropbox and then pull it up on your mobile device which is attached to a projector in a presentation, and then send the file out to everyone in the room with basically the click of a button. What's next?

Games, like art, serve no practical purpose, other than to provide a tangible link to ideas, emotions, history, and the like. I mean, clearly this is not the first time that this idea has been explored:

Image courtesy of Alidade Incorporated

But it is the first time that the pervasive, real world element has been included in the mix. This does not mean that the importance of art can just be discounted; on the contrary, more care must be taken to create and preserve representations of the intangible than the tangible. Physical devices are tangible; our interactions with them can be tangible and intangible; how we are engaged and affected by those interactions is intangible.

Blog Entry #5: Dance.Draw

References

Latulipe, C., and Huskey, S. (2008). Dance.Draw: exquisite interaction. Proceedings of HCI 2008 (pp. 47-51). Liverpool: http://www.bcs.org/category/14372.

Article Summary

Dr. Latulipe's Dance.Draw project is part of a larger effort called Exquisite Interactions, in which she explores the interaction of an artist with technology as they create their art. Dance.Draw specifically focused on creating visualizations based on the choreography of a dance for a single dancer controlling a single object, a single dancer controlling two objects, and three dancers controlling a single object. The merit of this system is its portability and accessibility: it uses three sets of two wireless computer mice and their USB receivers, a Mac computer, and a projector; it also costs around $1000, which is much cheaper than any other system of its type.

Dr. Latulipe exhibited this system three times in 2008, each time in a different environment with different restrictions on the display of the choreography and the display of the visualizations. It was well received by both the audience and the dancers at each venue. This system holds promising potential for future research in the area, according to the choreographer. Some areas that may be explored further include mouse-related choreography, i.e. learning how to deal with the fact that the dancers are holding mice, and alternative sensors to make the performance more organic.

Discussion

I found this application of technology to the field of art very interesting. We have electronic music as a performance art, and graphic design as an artistically-influenced field, but dance has never really presented itself as desirous of the option for technology interaction. To be fair, if there are laser shows at rock concerts, why can there not be visualizations based on a dancer's movements at a dance exhibition? It will be fun to see what doors will be opened by the inclusion of different types of sensors and the accuracy and precision with which the sensors detect the dancers' movements. We could project live-action fantasy movies with real-time renderings of monsters and such... Exciting :)

Paper Reading #5: A Multi-Touch Enabled Steering Wheel - Exploring the Design Space

Commentary

See what I have to say about Brian's and Pape's work.

References

Pfeiffer, M., et al. (2010). A multi-touch enabled steering wheel - exploring the design space. Proceedings of the ACM Conference on Human Factors in Computing Systems (pp. 3355-3360). Atlanta: http://www.sigchi.org/chi2010/.

Article Summary

Pfeiffer and his associates sought to extend the concept of steering wheel controls and generalize them with a multi-touch interface. They were motivated by the complex "infotainment" systems in most modern cars and their often distracting interfaces, and by the desire to extend previous work in this area. They created a working model of an input device that would take intuitive commands from the driver while allowing the driver to focus on the road and his driving.

The model steering wheel; input areas are indicated in white.

The driver was placed in a simulator that allowed him to experience the interface while actually keeping a car on the road and avoiding obstacles. He was asked to complete tasks such as "start playing music" or "open the navigation system" via thumb gestures, using either one or both thumbs. He did not have to search for the proper button on the steering wheel: he was simply asked to create a gesture that he felt would complete the task at hand. The researchers found that many of the same mental models already in place for other touch interfaces were utilized by the drivers to complete tasks, e.g. drawing a triangle to play music or using the pinch and pull techniques to zoom in and out of a map. They hope to extend the work to allow users to create customizable gesture sets and interfaces and be able to load them into other steering wheels in other cars.
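A trivial lookup from recognized thumb gestures to infotainment commands is enough to illustrate the interaction; the gesture labels below are my own inventions, and the recognition itself is assumed to happen elsewhere.

```python
# Hypothetical mapping from recognized thumb gestures (left thumb, right thumb)
# to infotainment commands; a real recognizer would supply the gesture labels.
GESTURE_COMMANDS = {
    ("triangle", None): "play music",
    ("swipe_up", None): "volume up",
    (None, "letter_N"): "open navigation system",
    ("pinch", "pinch"): "zoom map out",
    ("spread", "spread"): "zoom map in",
}

def dispatch(left_thumb=None, right_thumb=None):
    """Look up the command for the current one- or two-thumb gesture."""
    return GESTURE_COMMANDS.get((left_thumb, right_thumb), "no action")

if __name__ == "__main__":
    print(dispatch(left_thumb="triangle"))                    # play music
    print(dispatch(left_thumb="pinch", right_thumb="pinch"))  # zoom map out
    print(dispatch(right_thumb="letter_N"))                   # open navigation system
```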

Discussion

Alright, this is pretty cool. I am such a believer in the superiority of intuitive, touch-based interfaces over the standard push-button style stuff that we do these days. I personally believe that it just makes more sense to interact with an interface in a physical manner, like we would expect to interact with real objects in the real world. I do believe that some standard push-button type interfaces do have their place in the world, but I would just like to see more gesturing and tactile interaction going on.

For example, I think that part of the iPhone's initial success in the market was due to its interface. You don't have to scroll around the screen with a ball or wheel or button - just touch what you want to happen and it happens. Sure, touch screens and tablets and portable devices had been around forever, but the iPhone was practical and intuitive, head and shoulders above the competition. It will be a very short while before we see these kinds of interfaces popping up all over the place... I just don't know where yet... Sure, we might not replace the "Play" and "Eject" buttons on our Bluray players, and we probably will leave the standard up-down buttons or slide lever on our thermostats, but there has to be something out there that would benefit from a touch screen!

Paper Reading #4: There's A Monster In My Kitchen: Using Aversive Feedback to Motivate Behaviour Change

Commentary

See what I have to say about Keith's and Adam's work.

References

Kirman, B., et al. (2010). There's a monster in my kitchen: using aversive feedback to motivate behaviour change. Proceedings of the ACM Conference on Human Factors in Computing Systems (pp. 2685-2694). Atlanta: http://www.sigchi.org/chi2010/.

Article Summary

Kirman and his colleagues seek to incorporate some of the discoveries of behavioral psychology into an HCI project that they call the Nag-baztag. The idea behind this system is to not only monitor power usage but also provide feedback to the user in what they hope to be a more psychologically efficacious manner. The team outlines the concepts of positive and negative reinforcement, and how they will present each stimulus to the users of this system. In particular, the team aims to focus on aversive stimuli, such as punishment, for incorrectly performed behaviors or inconsiderate power usage. They address a concern about the applicability of a generalized psychological approach to different users by making their system adaptive: it compares power usage statistics against how it expects the user to respond based on the stimuli they have received, and adjusts its tactics to gain the desired result.

The system will deliver its stimuli to the user in the form of verbal comments, mostly as punishment for improper or imprudent use of power to perform everyday tasks in a kitchen. All electronic devices and even the sink will be monitored for usage. The system will "nag" the user for their poor choices in power management, and can interact with the user via Facebook, Twitter, or SMS. The system will be given enough control over the environment so as to be able to restrict the use of devices which have a history of improper use, or even take actions that will deliver negative consequences to the user in a real-life situation, such as not allowing the stove or the faucet to be used.
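Just to sketch what an escalating feedback policy might look like (the thresholds, wording, and strike counting are all invented by me, not taken from the paper):

```python
def choose_feedback(appliance, litres_used, target_litres, strikes):
    """Pick a response for one kettle-filling (or similar) event.
    Escalation from praise to nagging to restriction is a toy policy;
    the thresholds and wording are invented for illustration."""
    if litres_used <= target_litres:
        return "praise", f"Nice - the {appliance} used just what it needed.", strikes
    strikes += 1
    if strikes < 3:
        return "nag", f"That's too much water in the {appliance} again...", strikes
    return "restrict", f"The {appliance} is taking a break until you do better.", strikes

if __name__ == "__main__":
    strikes = 0
    for used in (0.3, 0.9, 1.1, 1.4):   # litres used vs a 0.5 L target
        kind, message, strikes = choose_feedback("kettle", used, 0.5, strikes)
        print(kind, "->", message)
```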

Discussion

I think that this is an interesting concept that might only see a very limited market. If you are given control over whether or not the system is in the house, what is stopping you from just pulling the plug when you get tired of hearing it complain? Granted, you never would have bought the system in the first place if you were not actually planning on using it, but it seems like it would take a very determined and motivated person not to shut the system off after being denied use of their sink or stove by some angry appliance.
An angry refrigerator.

That being said, I am curious why they chose to go with punishment over reward. Research has shown positive reinforcement to be much more efficacious than punishment in creating lasting behavioral changes. This is not to say that a little reminder every now and again would not be warranted, e.g. "You used too much water in the kettle this time." I understand their desire not to focus directly on negative reinforcement for the obvious reason, but maybe if a more holistic analysis of the space were taken into account, they would be able to streamline every aspect of the environment and remove the need for an adaptive system. For example:
  • Positive reinforcement: praise for using the optimal amount of water in the kettle for a cup of tea
  • Negative reinforcement: a buzzer sounds continuously while the kettle is left on too long until it is removed
  • Punishment: scolding for using too much water in the kettle
  • Omission: the stove will not heat up as fast as a result of consistently using too much water
This sort of thinking could be applied to the whole system without too much more overhead as far as implementation goes (as far as I can tell, in any case). It seems the user might respond better to something like this than to punishment alone.

20 February 2011

Blog Entry #4: If I Were An Ethnography REDUX

After having talked through our ethnography ideas with Dr. Hammond and her friend (attribution to follow), we have a clearer idea of what we would like to propose as an ethnographical study.

Instead of observing only study habits on the first floor of Evans Library, we are planning on observing the first floor of Evans Library as a whole to see how people interact within a space that has been traditionally set aside for study purposes.

The transformation of this space from one purely geared towards study to a more multi-modal space is made immediately apparent by the addition of a coffee bar at the front entrance. This had originally been a separate building, closed off from the library: now the walls have been opened and the entire first floor of the library has been revamped. There are plenty of tables at which to socialize or study in the coffee bar area; study couches, some with tables, just behind the coffee bar; and a new wing of modular study desks and individual chairs and couches at various locations throughout.

Our original idea of observing study habits was much too restrictive, as Dr. Hammond's friend (attribution to follow) pointed out. How would we know if people were studying or not? Is Facebook an indicator of a study break or idle passing of time between classes? Should we be interrupting people who might be studying to ascertain the aim of their dubious activities? Instead, we are going to observe interactions in the library in order to evaluate the space and its culture, if you will, to better understand how the modern college student understands the first floor of the library. Some things we might observe could include:
  • How many people are alone, or in groups of two, three, or more?
  • How many people are interacting with technology, e.g. cell phone, portable music player, computer?
  • Does apparent group size have any influence on technology use, i.e. where is the interaction focused?
  • What are the levels of interaction in the different perceived areas of the space?
This study will allow us to interact in the environment while we observe, which will hopefully contribute to a more organic study rather than something more akin to an experiment. We will also be able to reflexively examine our own interaction in the space with each other and others as we observe.

All in all, our goal is to try to enumerate the interactions on the first floor of Evans Library to discover how the space and its use have transformed from the traditional concept of a library.

27 January 2011

Paper Reading #3: The Coffee Lab: Developing a Public Usability Space

Commentary

See what I have to say about Jessica's and Joshua's work.

References

Karam, M. (2010). The coffee lab: developing a public usability space. Proceedings of the ACM Conference on Human Factors in Computing Systems (pp. 2671-2680). Atlanta: http://www.sigchi.org/chi2010/.

Article Summary

Karam outlines her project that explores usability in a public setting, called the Coffee Lab. The lab is set up in a local Toronto coffee shop and consists of several interactive systems with which Karam can conduct public usability tests (PUTs). The novel concept in this project is the fact that Karam conducts her studies outside of a traditional laboratory environment, affording her the opportunity to explore a wide user base in a more natural setting, in the hopes that she will get more salient results. Karam describes the project as "...[being] aimed at developing a permanent usability facility in a coffee shop, where different interactive systems are presented, evaluated, and experienced by anyone who enters the shop."

The Emoti-Chair in the Coffee Lab
Image courtesy of Sideshow Cafe

There are currently two interactive systems being tested at the Coffee Lab: the Emoti-chair and the iGesture. The infrastructure of the lab consists of several computers networked together with various webcams for data gathering and touchscreens for interaction. She evaluates the systems in five stages:
  • exposure, or first contact with the user;
  • experience, in which the user first learns about the system;
  • experiment, in which a more structured approach is taken towards gathering user input;
  • extension, which encompasses long-term study opportunities;
  • and exploration, in which, once the user has become familiar with the system and has completed the various questionnaires and interviews, he or she is allowed to interact with the system in a completely unstructured way.
Karam goes on to cover some of the results she has gathered and to offer some improvements that will be made to the system as time goes on.

Discussion

This is as ingenious a concept as the CAT article I reviewed previously. A recurring theme I've been noticing in the field of HCI is that of observing the interaction itself, not just trying to come up with and test what we as designers think is a creative idea. I guess the point I'm trying to make is that it's interesting to me to step away from design for a moment and focus on observation. If I could fault myself on one thing, it might be the atrophy of my creative process, maybe through disuse, maybe through too narrow a scope of exploration. I can complete a project to a specification, and I can surely come up with some improvements to that specification through the process of creating it, but I feel like employing some of these techniques from the field of HCI might help me overcome my inability to discover my own novel interactions.

26 January 2011

Paper Reading #2: Early Explorations of CAT: Canine Amusement and Training

Commentary

See what I have to say about Evin's and Paola's work.

References

Wingrave, C. A., et al. (2010). Early explorations of cat: canine amusement and training. Proceeding of the Acm conference on human factors in computing systems (pp. 2661-2669). Atlanta: http://www.sigchi.org/chi2010/.

Article Summary

Wingrave and his associates work in the field of CHCI, or Canine-Human-Computer-Interaction, in order to create an environment for meaningful interaction between humans and their canine companions using the help of computers. The system they created was designed to be used as an aid to effective canine training and also to teach the human how to develop appropriate habits for training and for play.

The proof of concept designed by the team comprised the setup of the system and three games: some basic training commands, a tag game, and a chase game. The system used Wiimotes from Nintendo to track canine movement and a combined TV-projector display (TV for the humans, projector pointed towards the floor for the dogs). Their initial results were reviewed by an animal training specialist, who gave them some insight as to the effects of both the games and the human participation on the dogs.

The team then altered the system to correct minor design flaws and developed new games that would provide a more efficacious training environment for the dogs: the new games focused on keeping both humans and dogs calm in various scenarios. Dogs learned to stay with and come to their owners, and to be placed in various locations in which they would stay.

The team is currently continuing to evaluate the responses of humans and canines alike to various changes in the system to provide a fun, usable system for CHCI. Their future plans include training tips based on observed human and canine behavior, the possibility of collaborative play, and eventually the distribution of this system on a widespread scale.

Discussion

I loved this article. As the new owner of a very young, very small dog who came from a shelter, I definitely understand the necessity for meaningful interaction between man and his best friend. My dog, Marvin, is still skittish when there's a lot of excitement going on, doesn't like other humans very much, and doesn't respond well to my training attempts (granted, I haven't taken the time, which I don't have anyway, to do any serious training). This isn't to say that I've given up on him, but I imagine a system like this could do wonders to help the both of us understand each other better. This isn't the first attempt at incorporating animals, or some reasonable facsimile thereof, into computer games for the purposes of learning and training:

Image courtesy of IGN

But I think it's pretty revolutionary to extend the field of HCI to CHCI. At the same time, I think that users should take as much care monitoring their pets' game time as they would their own children's. I mean, we all know that these sorts of things can get out of hand:

Image courtesy of Fort 90

25 January 2011

Blog Entry #3: If I were an ethnography, I would be about...

If I weren't a computer science major, I would definitely have been a psychology major. Or a sociology major. Or an anthropology major. Basically, I like people, and more than that, I like interacting with people and seeing how people interact with each other, and with technology. Case in point: my ethnography topics.

IDEA #1: Free as in speech, or free as in beer?

Image courtesy of xkcd

This idea centers around the fact that (and yes, I can be a pompous jerk sometimes) I don't feel that many people out there know about OSS, either in and of itself or as an alternative to proprietary software, and that those who do often don't know about the different degrees of open-ness. I aim to distribute a questionnaire about OSS, covering knowledge of:
  • Definitions of open, free, libre, and proprietary software
  • Types of open-source licenses
  • Open-source projects, ranging from common use to totally esoteric
  • and the differences between proprietary software packages and their open-source replacements
I feel that this exercise will 1) help me gain some perspective on the progress of open-source in today's technological age, and 2) open some eyes and win some hearts and minds over to the open-source initiative/side/philosophy/whatever.

IDEA #2: I like your hardware...

Image courtesy of iTech News Net

I'm just gonna come right out and say it: sometimes I get a little gear envy. I appreciate your full-frame DSLR with that great piece of glass on the front. I admire the well-tuned-but-not-overdone suspension and capable off-road tires on your Jeep. I commend the sensibility, usability, and quality of your MacBook Air. Oh, you didn't know what a great piece of technology you have there under your fingertips? I wonder how often this situation arises for the common folk (read: non-computer-related majors): you have a decent computer because your friend/dad/boss/salesman told you to buy it, but you don't really understand its capabilities. To really get a grasp on how much people understand about their personal technology, I'd like to pose the following questions, among others:
  • What's the make, model, and trim on your computer?
  • What were the considerations you made when purchasing it?
  • OR
  • What were the considerations someone made when purchasing it for you?
  • What are some of its main features? Processor cores/speed/caches, memory speed/amount, screen size and lighting, graphics capabilities, special peripherals, battery size/life, etc.
  • For what purpose do you mainly use your computer?
  • What is your overall satisfaction with your computer?
The purpose of this exercise is 1) to open my eyes to what "the common folk" (read: non-computer-related majors) know about the technology they rely on, and 2) to hopefully get them interested in learning about and looking into said technology.
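
As a side note for anyone who genuinely doesn't know the answers to the "main features" question above, here is a small Python sketch of my own (standard library only, not part of the questionnaire itself) that reports a few of the basics; it won't see your GPU or your screen, but it covers the operating system, architecture, processor, and core count:

  # Print a few basic facts about the machine you're sitting at, using only
  # the Python standard library.
  import os
  import platform

  def machine_summary() -> dict:
      """Collect a handful of answers to the 'main features' question."""
      return {
          "operating system": f"{platform.system()} {platform.release()}",
          "architecture": platform.machine(),
          "processor": platform.processor() or "unknown",
          "logical CPU cores": os.cpu_count(),
      }

  for key, value in machine_summary().items():
      print(f"{key}: {value}")

It's not quite the make, model, and trim, but it's a start.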

IDEA #3: I'm going to go blog about this in my blog because I'm a blogger

Image courtesy of Gaping Void

Last idea: we blog a lot in this class. Some people have blogged before; some never have. Some people know what a blog is, but have no idea where that word came from. Some people read blogs religiously; some use them as a valuable resource for information and opinions, but not much else. The fact of the matter is that blogs are available to pretty much everyone, everywhere. I want to find out how involved or concerned people are with blogs:
  • Do you know where the words blog and vlog come from?
  • Do you read blogs/watch vlogs? How many? With a program or on the main site?
  • How do you use the information in the blogs you read/vlogs you watch?
  • Do you have a blog/vlog? Who hosts it? How many readers/viewers do you have?
  • Do you find blogging/vlogging to be a worthwhile endeavor? Why or why not?
  • Do your friends and family find blogging/vlogging to be a worthwhile endeavor? Why or why not?
The purpose of this exercise is [see above].

People are just so much fun :)

23 January 2011

Blog Entry #2: On Computers

Commentary

See what I have to say about Shena's and Vince's work.

References

Aristotle. (1994). On plants. In J. Barnes (Ed.), The complete works of Aristotle (pp. 1252-1271). Princeton, NJ: Princeton University Press.

On Computers

Intelligence and understanding are found in humans and computers; but while in some humans they are clearly manifest, in computers they are programmed and unnatural. For before we can assert the presence of true understanding in computers, a long inquiry must be held as to whether computers possess a soul, or rather a true capacity to understand. This inquiry, however, will not be performed here.

Some computers contain within them graphics processing units integrated into the motherboard. Some have stand-alone or third-party graphics processing units. Some have even two or three, and maybe one is used independently as a PhysX engine. Some components of the computer are simple, such as a radiated heatsink-cooled northbridge and an air-cooled southbridge; some are more complex, like a thermoelectrically-cooled processing core. Computers possess other various parts as well: SATA cables, power supplies, and peripheral ports.

Just as in the human, so also in the computer there are homogeneous parts: the case of the computer is like the skin of the human; the processor is like the brain; the data buses and cables are like nerves and veins; but that's pretty much it. If a computer had a stomach, it would probably be the hard drive, but that metaphor is a bit of a stretch, and so you should probably ignore it.

I suppose I could carry on forever, but alas, I am not a philosopher, but merely a student, a blogger no less, and this farce is but a travesty of the great works from minds much keener than mine.

Image courtesy of Corbis Images

So don't take me too seriously :)

20 January 2011

Blog Entry #1: Minds, Brains and Programs

Commentary

See what I have to say about Wesley's and Bain's work.

References

Searle, J. R. (1980). Minds, brains and programs. Behavioral and Brain Sciences, 3(3), 417-457.
Chinese room. (n.d.). Retrieved from http://en.wikipedia.org/wiki/Chinese_room
Dijkstra, E. W. (1984). The threats to computing science. Proceedings of the Acm 1984 south central regional conference (pp. 3). Austin, TX: http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD898.PDF.

Article Summary

In this article, Searle sets out to disprove the possibility of strong artificial intelligence (AI), arguing specifically that a computer program cannot display cognition the way a human brain can. He does this by setting up what is known as the Chinese room, a thought experiment in which Searle, locked in a room and knowing no Chinese, is nevertheless able to convince an outside observer that he does know it. Given a set of Chinese characters and a set of English rules for manipulating them, Searle states that he can receive input in Chinese, apply the English rules to what he sees to create new Chinese data, and pass this data back out as acceptable Chinese.
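
To put the thought experiment in terms a programmer would recognize, here is a minimal Python sketch of pure formal symbol manipulation. The rulebook and phrases are entirely my own invention for illustration (Searle's paper contains no code), but the structure is the point: replies are produced by matching shapes, with no representation of meaning anywhere.

  # A toy "Chinese room": a lookup table pairs incoming symbol strings with
  # canned responses, the way Searle's English rulebook pairs squiggles
  # with squoggles. Nothing here knows what any of the symbols mean.
  RULEBOOK = {
      "你好吗？": "我很好，谢谢。",          # "How are you?" -> "I'm fine, thanks."
      "今天天气怎么样？": "今天天气很好。",  # "How's the weather?" -> "It's lovely."
  }

  def chinese_room(symbols: str) -> str:
      """Return a plausible-looking reply by rule lookup alone."""
      # String equality is the only "understanding" happening here.
      return RULEBOOK.get(symbols, "请再说一遍。")  # fallback: "Please say that again."

  print(chinese_room("你好吗？"))  # looks fluent to the outside observer

The outside observer sees acceptable Chinese; the function sees only string equality, which is exactly the gap Searle is pointing at.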

Searle also addresses some of the common arguments against his position, and finally addresses the question of what he believes understanding to be. He repeats his assertions from the beginning of the article: that "intentionality in human beings...is a product of causal features of the brain"; and that "instantiating a computer program is never by itself a sufficient condition of intentionality." He concedes that it may be possible to "give" a computer the facilities that make him, as a human, intentional, but still maintains that "formal symbol manipulations by themselves don't have any intentionality...."

Discussion

Sometimes, when I dive into things like the Chinese room argument, and I start getting all metaphysical and stuff, I start to feel a lot like this:

Image courtesy of xkcd.com

In all honesty, I'm with Dijkstra:
...the question of whether Machines Can Think...is about as relevant as the question of whether Submarines Can Swim.
I really don't feel that it matters whether or not we create machines that exhibit strong AI or are capable of cognition or are intentional or whatever. As long as we're discussing thought experiments, given an unlimited amount of processing and memory resources, and a program covering enough inputs and their appropriate outputs with enough complexity, weak AI will always be strong enough for anything we really need it for. I mean, I'm all for helping the elderly cross the street or carrying groceries for the single mother with three kids:

Image courtesy of imdb.com

But once we start trying to capture the human essence and start thinking about machines as humans, who are to be afforded the same rights, who are to be worthy of the gift of our love, we run into some serious issues:

Image courtesy of imdb.com

I'm just sayin'.

19 January 2011

Paper Reading #1: Sequential Arts for Science and CHI

Commentary

See what I have to say about Zack's and John's work.

References

Rowland, D., et al. (2010). Sequential arts for science and chi. Proceeding of the Acm conference on human factors in computing systems (pp. 2651-2660). Atlanta: http://www.sigchi.org/chi2010/.

Article Summary

Rowland and his associates performed several preliminary experiments on the relationship between sequential art and specific scientific processes, with the help of Plasq's Comic Life program. Their method of delivery was itself a sequential art format: the paper is actually a comic strip. The authors cite the impact that visual media have on visual creatures such as human beings. One author puts it succinctly:
"This paper suggests that sequential art offers unique mechanisms of communication that may be of use to science."
In the first experiment, a group of children completed a science project in which they had to design an effective alternative energy solution through wind power; their teacher documented their progress. Then the children were asked to create a "photostory" with the use of the Comic Life program that depicted the experiment from start to finish.

In the second experiment, the physiological responses and facial expressions of people on theme park rides were both recorded with respect to time, and the participants were given a DVD of their faces during the ride. They were later asked to select images that corresponded to different emotions they felt while on the ride, and these images were made into a photostory for them.

Discussion

This paper is about as brief an overview of this topic as one might imagine, but to be fair, the authors were sure to point out the "preliminary [nature of their] studies" from the start. I agree that the use of sequential art could be very powerful at "keeping only the essence" of an idea, as one author points out. It is even postulated that "reality can be sampled and distilled into concepts" through the use of such visual communication methods. That being said, the only things I would fault this work on are:
  • drawing from one example that, from what I can see, was already pretty distilled (case in point: the children's alternative energy science experiments);
  • and not actually distilling the concepts out of the reality of the theme park ride experiment (or at least not providing the results of the study).
I think that the next step in this research should be to apply it to a more complex model or task, say, modeling an internal combustion engine. For example, one might want to show that:
  1. It is possible to contain enough information using sequential art to explain internal combustion (to some degree of complexity).
  2. The information contained by the sequential art is somehow comparable in complexity to the information contained by the reality of the task, i.e. in a mechanical engineering textbook.
Basically, for this idea to take hold, one must be able to prove both the possibility and the efficacy of the use of sequential art in a practical situation. Case in point: what would be an acceptable level of complexity and distillation for something like this?


I, for one, would be interested to see that happen.