Hi, I'm Johara

I find insights through user research and data synthesis. I'm currently pursuing my MS in Data Science at the University of Michigan in Ann Arbor.

My techniques range from data mining and machine learning to interviews and field studies. I also like presenting my findings through visual stories. I've found that engaging visuals create a better experience for people to explore and connect with.

Previously, I worked as a UX analyst at Zenuity, as a research specialist at the Center for Complex Engineering Systems (CCES), and as a research affiliate at the Massachusetts Institute of Technology. I am passionate about working with people and data to find solutions to real-world problems.

In my life, I enjoy cooking, salsa and yoga :-)

Gendered Spaces

In Saudi Arabia, imposed gender segregation shapes cities in a way that is not commonly found elsewhere. This segregation policy drives men and women to different areas of the city in different ways, influencing the emergence of gendered spaces. This work uses social media data to better understand gendered spaces throughout the city of Riyadh.
I developed an algorithm to perform gender annotation based on user first names. The method, optimized for English and Arabic language names, was applied to a sample of over 120,000 geotagged tweets posted between November 2016 and January 2017. The customer demographics of Foursquare venues were estimated based on the gender ratio of reviewers. Areas with a high degree of gender concentration in these datasets were used to identify gendered spaces. The correlation between gendered spaces identified from tweets and from Foursquare venues was used to examine the link between amenities and gender-specific mobility habits in Riyadh. Throughout this analysis, the aim is to identify ways in which government policies, and the organization of businesses and services with similar customer demographics, impact the mobility patterns of women and men and lead to the emergence of gendered spaces in Riyadh.
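
As a rough illustration of name-based gender annotation and the venue gender ratio, here is a minimal Python sketch. It is not the published algorithm: the tiny `NAME_GENDER` lookup, the first-token rule, and both function names are assumptions made for this example only.

```python
from collections import Counter

# Hypothetical lookup of first names to gender. A real list would be far
# larger and would cover spelling variants in both Arabic and English.
NAME_GENDER = {
    "sara": "F", "سارة": "F", "noura": "F",
    "ahmed": "M", "أحمد": "M", "khalid": "M",
}

def annotate_gender(display_name):
    """Return 'F', 'M', or None based on the first token of a display name."""
    tokens = display_name.strip().lower().split()
    if not tokens:
        return None
    return NAME_GENDER.get(tokens[0])

def female_share(display_names):
    """Share of gender-annotated names that are female, or None if none match."""
    counts = Counter(annotate_gender(name) for name in display_names)
    known = counts["F"] + counts["M"]
    return counts["F"] / known if known else None
```

Names that do not appear in the lookup are left out of the ratio rather than guessed, so a venue with few recognizable reviewer names yields a noisier estimate.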

Alfayez, Aljoharah, et al. "Understanding Gendered Spaces Using Social Media Data." International Conference on Social Computing and Social Media. Springer, Cham, 2017.

Saudi Data Portal

A joint project with Saudi Arabia’s General Authority for Statistics. The objective of the portal is to house the rich statistical information collected by the authority and to present it to the public in an easy-to-use, interactive way. The portal is currently under development and will be published by the beginning of 2018. The website aims not only to house data, but also to encourage civic engagement and public empowerment through data transparency and inclusion, holding policy makers and data publishers accountable through public scrutiny.
My role in this project is in the design and development cycles. I first worked on developing meaningful data stories from unstructured data, to make the data easier to understand and to give it proper context. Afterwards, I contributed to developing the visualizations using D3 and AngularJS, and to automating the production of these visualizations given the structure of the data to be visualized. Additionally, I conducted an evaluation and comparative study of similar open data platforms. One important aspect to consider for the website is choosing the right graphics and semiotics to represent the content. I am currently working on a paper that captures parts of this process, titled ‘A Semiotics Analysis of Icons in Open Data Portals’, which will be published soon.

CA Optimization Engine

As part of Saudi Arabia’s plan to build a sustainable economy, the country has announced that its income must no longer depend so heavily on oil. The plan is to diversify the nation’s economy and limit the extensive subsidies that are currently in place. The Citizen’s Account (CA) program has been introduced to provide financial support to citizens who are economically vulnerable to the increase in the cost of energy and water. The Ministry of Economy and Planning (MEP), the Ministry of Finance (MOF), and the Ministry of Labor and Social Development (MLSD) are leading the effort to design a program that will help achieve this vision.
The primary objective is to identify individuals in need of more financial support based on certain inputs. As part of this team, we recognized the need for policy makers to explore the results effectively and efficiently. So, we provided the ministries with an interactive tool that enables stakeholders to explore, experiment with, and analyze the results without having to deal with the complexity of the models under the hood.

CA Optimization Engine
My role in this project was to build the optimization engine. The engine accepts three groups of inputs from the user: an objective function, the range of possible values for the optimization variables, and the range of the remaining variables (default values are provided).
The final output of the engine is an optimal scenario identified based on the user’s inputs. In addition to displaying the value of each lever in the optimal scenario, the interactive platform also allows the user to display the results of the optimal scenario on a geospatial plot and compare the KPIs to predefined scenarios on a scatter plot.

In the figure above we can see a comparison of the predefined scenarios (red) and the scenarios recommended by the optimization engine (blue) when a core value of 370 is used.
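
To make the three input groups and the single output concrete, here is a minimal Python sketch. It assumes an exhaustive grid search, which may not be the method the actual engine uses. The `optimize` function, the lever names (`payment`, `threshold`, `weight`) and the toy objective are all hypothetical.

```python
from itertools import product

def optimize(objective, search_ranges, fixed_values):
    """Return (best_scenario, best_score) by trying every lever combination.

    objective     -- function mapping a scenario dict to a score to maximize
    search_ranges -- dict of lever name -> iterable of candidate values
    fixed_values  -- dict of remaining variables held at their defaults
    """
    names = list(search_ranges)
    best_scenario, best_score = None, float("-inf")
    for combo in product(*(search_ranges[name] for name in names)):
        scenario = {**fixed_values, **dict(zip(names, combo))}
        score = objective(scenario)
        if score > best_score:
            best_scenario, best_score = scenario, score
    return best_scenario, best_score

# Toy objective for illustration only: it peaks at payment=300, threshold=2.
def toy_objective(s):
    return -(s["payment"] - 300) ** 2 - abs(s["threshold"] - 2) * s["weight"]

best, score = optimize(
    toy_objective,
    {"payment": range(100, 501, 100), "threshold": [1, 2, 3]},
    {"weight": 10},
)
```

A grid search is easy to reason about but grows quickly with the number of levers, so a real engine with many optimization variables would likely need a smarter search strategy.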

Taxi Analysis

I describe one approach to land classification that links taxi drop-off cost to traffic analysis zones (TAZs). The number and costs of taxi drop-off points in the city of Riyadh are visually explored to identify social and urban behavioral patterns. After analyzing the gender-annotated data, some expected gender biases in the data set are identified, since female mobility is restricted in Saudi Arabia and public transportation options are limited. Finally, we visualize the number and cost of drop-offs per TAZ for males and females and identify potential areas for future research.


Alfayez, Aljoharah, and Salma Aldawood. "Visual Exploration of Urban Data: A Study of Riyadh Taxi Data." International Conference on Social Computing and Social Media. Springer, Cham, 2017.


Frisk is an information extraction system. It extracts events and contacts from Arabic emails (fetched from the user's Gmail inbox) and suggests adding them to Google Calendar or Google Contacts. Frisk consists of a text processing unit and a gadget that displays the results.

The extraction service is composed of three components:
- The first component is the pre-processing component that prepares the received text for annotation.
- The second component is responsible for adding annotations. This is done using the GATE framework (General Architecture for Text Engineering), supplemented with our own logic to tailor the process to our needs.
- Finally, the third component is the post-processor that combines the annotations to create events/contacts using algorithms written by us, and arranges the results in a format understood by the gadget.

Finally, the gadget communicates with this service to retrieve the results and display them to the user, who can choose to add them to Google Calendar or Google Contacts.
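
The three-stage flow can be sketched in Python as follows. This is only an illustration of the pipeline shape: the real system uses GATE for annotation, whereas this sketch swaps in two simple regular expressions as a stand-in. All function names and the output format are assumptions.

```python
import re

def preprocess(text):
    """Stage 1: normalize whitespace so the annotator sees clean text."""
    return re.sub(r"\s+", " ", text).strip()

def annotate(text):
    """Stage 2: tag spans of interest (regex stand-in, NOT the GATE framework)."""
    annotations = []
    for match in re.finditer(r"\b\d{4}-\d{2}-\d{2}\b", text):
        annotations.append(("date", match.group()))
    for match in re.finditer(r"[\w.+-]+@[\w-]+\.[\w.-]+", text):
        annotations.append(("email", match.group()))
    return annotations

def postprocess(annotations):
    """Stage 3: combine annotations into events and contacts for the gadget."""
    return {
        "events": [{"date": value} for kind, value in annotations if kind == "date"],
        "contacts": [{"email": value} for kind, value in annotations if kind == "email"],
    }

def extract(text):
    """Run the full pipeline on one email body."""
    return postprocess(annotate(preprocess(text)))
```

Keeping the stages as separate functions mirrors the component split described above, so the annotation step could be replaced without touching the other two.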

Traffic Mortality

I contributed to the visualizations on the Saudi Traffic website, which has gained traction in Saudi Arabia.

Inclusive Wealth

I designed, developed and delivered the Inclusive Wealth interface which gives users an opportunity to explore inclusive wealth data through a storytelling approach.

Dust and Demand

Little research has aimed to identify a relationship between air quality and power demand. In this work, we conduct a study of the city of Riyadh in which we compare air quality, power demand and temperature data to identify patterns of electricity consumption during dust storms.

Smart Placement

Smart Placement is a tool that aims to help business owners make well-informed decisions that may impact the success of their business. It is a recommender system for store placement that relies on user priorities. The tool uses several factors: it constructs a network of amenities and studies their co-location patterns, associates social media popularity with those amenities, looks at the gender distribution across them, and uses our existing traffic model to look at the traffic flows related to them. Based on these factors, we can recommend the best store placement given a user’s priorities.
Using both our previous work and techniques from existing literature, we can evaluate both neighborhoods and individual stores based on a number of characteristics.
We can integrate our results on top of the city dynamics platform.
Our initial task is to establish a rating system that can provide rankings for a selected location and the surrounding area for the following criteria:
1. Amenity space (co-location network of amenities):
By using the type and location of existing points of interest, we can construct a network of different amenity types that describes how often they tend to be found near one another. This can be compared to the distribution of amenities in a specific neighborhood to evaluate its capacity for additional locations.
2. Gendered locations (e.g. Twitter, Foursquare):
We currently have a database of over 1,000 Arabic names (written in both Arabic and English) and their associated gender. This can be used to identify the gender of users on social media applications such as Twitter and Foursquare. From geo-tagged tweets, we can use this data to understand where men and women go throughout the city. This will help us determine concentration of the target demographics throughout the city.
3. Number of visits and traffic flow:
Building upon our previous work, we can incorporate existing mobility patterns in our analysis to consider visitors to a given area. We can determine how many people visit a particular area throughout the day and estimate how long they stay there. Using the transportation model, we can also estimate the flow of traffic passing through an area (which can be desirable for businesses that rely on exposure) and evaluate the accessibility of a location for the surrounding population.
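
The first criterion, the co-location network, can be sketched in Python as below. This is a simplified illustration, assuming points of interest are given as `(type, x, y)` tuples in projected coordinates (metres) and that "near" means within a fixed radius. The function name and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import hypot

def colocation_counts(pois, radius):
    """Count how often two different amenity types lie within `radius` of each other.

    pois   -- list of (amenity_type, x, y) tuples in projected coordinates
    radius -- distance threshold in the same units as x and y
    """
    counts = Counter()
    for (type_a, xa, ya), (type_b, xb, yb) in combinations(pois, 2):
        if type_a != type_b and hypot(xa - xb, ya - yb) <= radius:
            # Sort the pair so (cafe, bookstore) and (bookstore, cafe) share a key.
            counts[tuple(sorted((type_a, type_b)))] += 1
    return counts

# Made-up sample points for illustration only.
sample_pois = [
    ("cafe", 0, 0),
    ("bookstore", 0, 50),
    ("cafe", 1000, 0),
    ("pharmacy", 1000, 30),
    ("bookstore", 20, 0),
]
edges = colocation_counts(sample_pois, radius=100)
```

The resulting counts act as edge weights between amenity types. Comparing them with the amenity mix of one neighborhood is one way to judge whether a given type is under-represented there.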



I am also contributing to a paper titled "Measuring the Realistic Commute Cost for Saudi Females in Riyadh And Its Impact on Employment", which measures employment accessibility for women in terms of commute cost. In the study, we estimate the impact of accessibility on employment through a multivariate regression model. This work has not been published yet.