A Visual Analytics system for exploring Roman history.
Analyzing demographic information from social media data.
Extracting event information from social media data.
Here is a list of selected research projects.
Learning about Roman history is of interest to students and citizens at large. It is an example of a subject with great sweep (many interrelated sub-topics over, in this case, a 3,000-year history) that is hard for any individual to grasp and that, in its full detail, is not available as a coherent story. In this project, we developed a visual analytics approach to construct a data-driven view of Roman history based on a large collection of Wikipedia articles. Extracting and enabling the discovery of useful knowledge about events, places, times, and their connections from large amounts of textual data has always been a challenging task. To this end, we introduce VAiRoma, a visual analytics system that couples state-of-the-art text analysis methods with an intuitive visual interface to help users make sense of events, places, times, and, more importantly, the relationships between them. VAiRoma goes beyond textual content exploration: it lets users compare, make connections, and externalize their findings, all within the visual interface. As a result, VAiRoma allows users to learn and create new knowledge regarding Roman history in an informed way.
In this project, we developed a novel visual text analytics system, DemographicVis, to aid interactive analysis of demographic information based on user-generated content. Our approach connects categorical data (demographic information) with textual data, allowing users to understand the characteristics of different demographic groups in a transparent and exploratory manner. The modeling and visualization are based on ground-truth demographic information collected via a survey conducted on Reddit.com. Detailed user information feeds into our modeling process, which connects the demographic groups with the features that best describe the distinguishing characteristics of each group. Topical and linguistic features are generated from the user-generated content. These features are then analyzed and ranked based on their ability to predict the users' demographic information. To enable interactive demographic analysis, we introduce a web-based visual interface that presents the relationships among the demographic groups, their topic interests, and the predictive power of various features.
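As a rough illustration of ranking features by how well they predict a demographic attribute, the sketch below scores each feature by its mutual information with a group label. This is an assumption made for illustration only: the scoring measure, the function names (`mutual_information`, `rank_features`), and the toy data are hypothetical and are not taken from DemographicVis.

```python
import math
from collections import Counter


def mutual_information(feature_values, labels):
    """Mutual information (in bits) between a discrete feature and a label."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))
    feature_counts = Counter(feature_values)
    label_counts = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x, y) / (p(x) * p(y)), written with raw counts
        ratio = p_xy * n * n / (feature_counts[x] * label_counts[y])
        mi += p_xy * math.log2(ratio)
    return mi


def rank_features(features, labels):
    """Return (feature name, score) pairs, most predictive first."""
    scores = {
        name: mutual_information(values, labels)
        for name, values in features.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: four users, two demographic groups.
labels = ["group_a", "group_a", "group_b", "group_b"]
features = {
    "mentions_topic_x": [1, 1, 0, 0],  # separates the two groups exactly
    "uses_long_words": [1, 0, 1, 0],   # unrelated to the group label
}

ranking = rank_features(features, labels)
for name, score in ranking:
    print(f"{name}: {score:.2f} bits")
```

In this toy setting the topical feature ranks first because it determines the group label, while the linguistic feature carries no information about it. Any classifier-based importance measure could be substituted for the scoring function.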
Analyzing large textual collections has become increasingly challenging given the size of the available data and the rate at which new data is generated. Topic-based text summarization methods coupled with interactive visualizations are a promising approach to analyzing large text corpora. As the corpora and their vocabulary grow larger, more topics need to be generated to capture the meaningful latent themes and nuances. However, most current topic-based visualizations struggle to represent a large number of topics without becoming cluttered or illegible. To facilitate the representation and navigation of a large number of topics, we developed a visual analytics system, HierarchicalTopic (HT). HT integrates a computational algorithm, Topic Rose Tree, with an interactive visual interface. The Topic Rose Tree constructs a topic hierarchy from a list of topics. The interactive visual interface presents the topic content as well as the temporal evolution of topics in a hierarchical fashion. Users can interact with the interface to modify the topic hierarchy based on their mental model of the topic space.
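To give a feel for how a flat list of topics can be organized into a hierarchy, the sketch below greedily merges the two most similar topics until a single tree remains. It is a simplified stand-in, not the Topic Rose Tree algorithm: it produces only binary merges, uses cosine similarity between topic word distributions as an assumed merge criterion, and all names (`cosine`, `average`, `build_hierarchy`) and the toy topics are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse word-weight dictionaries."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def average(a, b):
    """Average two word-weight dictionaries to represent a merged node."""
    words = set(a) | set(b)
    return {w: (a.get(w, 0.0) + b.get(w, 0.0)) / 2 for w in words}


def build_hierarchy(topics):
    """Merge the most similar pair of nodes until one tree is left.

    Leaves are topic names (strings); internal nodes are 2-tuples of children.
    """
    nodes = [(name, vector) for name, vector in topics.items()]
    while len(nodes) > 1:
        best = None  # (similarity, index_i, index_j)
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                sim = cosine(nodes[i][1], nodes[j][1])
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        _, i, j = best
        merged = (
            (nodes[i][0], nodes[j][0]),
            average(nodes[i][1], nodes[j][1]),
        )
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)]
        nodes.append(merged)
    return nodes[0][0]


# Hypothetical topics, each a small word distribution.
topics = {
    "sports": {"game": 0.6, "team": 0.4},
    "football": {"game": 0.5, "team": 0.3, "goal": 0.2},
    "politics": {"vote": 0.7, "election": 0.3},
}

tree = build_hierarchy(topics)
print(tree)
```

Here the two sport-related topics share vocabulary and are grouped first, and the unrelated politics topic joins only at the root. A rose tree differs from this binary sketch in that a node may have any number of children.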