Building a Data-Backed Persona
by Andrea Wiggins on 2007/11/14
Incorporating the voice of the user into user experience design by using personas in the design process is no longer the latest and greatest new practice. Everyone is doing it these days, and with good reason. Using personas in the design process helps focus the design team’s attention and efforts on the needs and challenges of realistic users, which in turn helps the team develop a more usable finished design. While completely imaginary personas will do, it seems only logical that personas based upon real user data will do better. Web analytics can provide a helpful starting point to generate data-backed personas; this article presents an informal 5-step process for building a “persona of the people.”
In practice, outcomes indicate that designing with any persona is better than with no personas, even if the personas used are entirely fictitious. Better yet, however, are personas that are based on real user data. Reports and case studies that support this approach typically offer examples incorporating data into personas from customer service call centers, user surveys and interviews. It’s nice work if you can get it, but not all design projects have all (or even any!) of these rich and varied user data sources available.
However, more and more sites are now collecting web analytic data using vendor solutions or free options such as Google Analytics. Web analytics provides a rich source of user data, unique among the forms of user data that are used to evaluate websites, in that it represents the users in their native habitat of use. Despite some drawbacks to using web analytics that are inherent to the technology and data collection methods, the information it provides can be very useful for informing design.
Google Analytics is readily accessible and offers great service for the price, so for the sake of example, the methods described here will refer to specific reports in Google Analytics. Any web analytics solution will provide basic reporting similar to Google Analytics, give or take a few reports, so using a different tool will just require you to determine which reports will provide data equivalent to the reports mentioned here.
To illustrate the process, an example persona design scenario is included in the description for each of the five steps:
Kate is an independent web design contractor who is redesigning the website of a nonprofit professional theater company. She has hardly any budget, plenty of content, and many audiences to consider. The theater’s website fills numerous functions: it advertises the current and upcoming plays for patrons; provides patrons information about ticketing and the live theater experience; announces auditions; specifies playwright manuscript and design portfolio requirements for theater professionals; recruits theater intern staff; serves as the central repository of collected theater history in the form of past play archives and press releases; advertises classes and outreach activities; and attempts to develop a donor base as well. As she gathers requirements, Kate decides to use the theater’s new Google Analytics account as a data source for building personas.
Step One: Collect Data
After Google Analytics has been installed on a site, you must wait for data to accumulate. Sometimes you will have the good fortune to start a project that has already been collecting data for months or years, but when this isn’t the case, try to get as much data as you can before extracting the reports you will use to build personas. Ideally, you want to have enough data for reporting to have statistical power, but not all sites generate this level of traffic. As a rule of thumb, less than two weeks of data is not sufficient for any meaningful analysis. One to three months of the most recent data is much more appropriate.
If it is reasonable, try to set up two profiles to filter on new and returning visitors. While some Google Analytics reports do allow segmentation, profile filtering on new versus returning visitor status gives you the best access to the full array of reports for each visitor segment. If this setup can be arranged early in data collection, then you can later draw on a profile that contains only new visitors to determine the characteristics of your personas who are new visitors, and likewise for returning visitors.
Kate has been given administrator privileges in the theater’s Google Analytics account for the duration of her contract. The theater has just one profile that includes all site traffic, so she starts off by making two new profiles with filters to include new visitors in one profile and returning visitors in the other. Kate knows that she needs a decent sample of site data, so she monitors the profiles weekly to make sure that the data is accumulating. She starts designing her personas using the existing Google Analytics profile (all visitors), and checks back later on the custom profiles to see if the segmented data can provide any new insights to add to her personas.
Step Two: Determine How Many Personas to Use
Next, determine how many personas to use: generally no fewer than three and rarely more than seven or eight. This gives you the number of blank slates across which to proportionately distribute the user characteristics that you extract from Google Analytics reports. If there are four personas, each will be assigned the characteristics of 25% of the site audience in each report; if there are five, each represents 20% of the site audience. Even though you’re working with statistics, you don’t have to be exacting in proportionately representing user segments; sometimes it is very important, for business reasons, to strongly represent a small user segment.
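The proportional split described above can be sketched in a few lines of Python. This is only an illustration, not part of Google Analytics: the function name `allocate_personas`, the largest-remainder rounding, and the visit counts are all assumptions made for the example.

```python
from math import floor


def allocate_personas(visits_by_segment, persona_count):
    """Split persona slots across report segments in proportion to visits.

    Uses largest-remainder rounding (an assumed choice) so that the
    slots always add up to persona_count.
    """
    total = sum(visits_by_segment.values())
    # Ideal, possibly fractional, number of personas per segment.
    quotas = {name: visits * persona_count / total
              for name, visits in visits_by_segment.items()}
    slots = {name: floor(quota) for name, quota in quotas.items()}
    leftover = persona_count - sum(slots.values())
    # Hand any remaining slots to the segments with the largest fractions.
    by_remainder = sorted(quotas, key=lambda name: quotas[name] - slots[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        slots[name] += 1
    return slots


# Hypothetical visit counts, not taken from any real report.
visits = {"new": 750, "returning": 250}
print(allocate_personas(visits, 3))  # {'new': 2, 'returning': 1}
print(allocate_personas(visits, 4))  # {'new': 3, 'returning': 1}
```

Treat the result as a starting point rather than a rule. If a small segment matters to the business, give it a persona even when the arithmetic says otherwise.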
After thinking carefully about the many functions that the site has to fill, Kate looks at the Top Content report in Google Analytics to see what pages get the most traffic. She notices that most of the top pages are related to current shows, tickets and directions, and decides that she will have at least one persona represent a first-time patron who plans to travel from out of town. The other pages that are popular include the “About Us,” “People,” and “Classes” pages; “Auditions” is a little further down the page, but well above “Support Us.” Kate determines that she will create another persona to represent people interested in joining the theater company. Kate knows that fund development is important to the theater, but it doesn’t appear to be all that important to the website audience, so she decides to create another patron persona who has attended several plays and is interested in becoming a donor. She feels that these three roles can represent the audience the theater is most interested in reaching, and starts creating a persona document for each of them. She names her personas: Regina is the first-time out-of-town patron, Monica is the would-be theater participant, and Rex is the returning patron.
Step Three: Gather Your Reports
After allowing some data to accumulate, the next step is to acquire the Google Analytics reports, whether you’re interacting directly with the application yourself or someone else is providing you with reports. If you are not the person extracting data, make sure that you receive the PDF exports of reports, as these contain summary data that is not present in some of the other export formats. Whether or not you have profiles that are filtered on new versus returning visitor segments, you will be interested in the same handful of reports:
- Visitors Overview Report. In one convenient dashboard-style screen, you can get the percentage of “new visits,” or visits by new visitors, and a snapshot of other visitor characteristics.
- Browsers and OS Report. While you can look at browsers and operating systems separately in other individual reports, it usually makes more sense to look at them in combination in the Browsers and OS Report. Typically only a handful of browser and operating system combinations are required to represent well over 90% of the site’s visitors.
- Map Overlay Report. To use this report, which provides a great deal of detail on the geographic origins of site visits, you will need to do just a little bit of math. Divide the number of visits from the top country or region (whichever is of greater use to you) by the total number of visits to get the percentage of visits from that geographical area. This allows you to determine the proportions of domestic and international visits. For the visits from your country, you will want to drill down to the city level and select a few cities from the top ranks of the list, keeping in mind that big cities will statistically generate more traffic than small ones. For your international visitors, choose from the top cities in the countries that bring the most visits.
- Keywords Report. This report shows the queries that bring users to your site. When you look at the search engine query terms, ask yourself, “What are our users looking for? What type of language do they use when searches bring them to our site?” This gives you a starting point to think about user motivations and goals.
- Referring Sites Report. Like the Keywords Report, the Referring Sites Report gives you an opportunity to look for answers to questions like, “Where do our users come from? Are they reaching our site from search engines, other sites, or just appearing directly with no referrer, as returning visitors are more likely to do?”
If you have the segmented profiles set up, extract the same reports from both of these profiles, and get the Visitors Overview report from an unfiltered profile.
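The "little bit of math" for the Map Overlay Report, and the check that a handful of browser and operating system combinations cover most visitors, might look something like the sketch below. It assumes the report figures have already been copied by hand into plain Python dictionaries; the function names and all of the numbers are hypothetical.

```python
def share_of_visits(visits_by_region, region):
    """Return the percentage of all visits that came from one region."""
    total = sum(visits_by_region.values())
    return 100 * visits_by_region[region] / total


def combos_covering(visits_by_combo, target_percent=90):
    """Return the top browser/OS combinations, most visits first, until
    their cumulative share of visits reaches target_percent."""
    total = sum(visits_by_combo.values())
    covered = 0
    selected = []
    ranked = sorted(visits_by_combo.items(), key=lambda item: item[1],
                    reverse=True)
    for combo, visits in ranked:
        selected.append(combo)
        covered += visits
        if 100 * covered / total >= target_percent:
            break
    return selected


# Made-up figures for illustration only.
visits_by_state = {"Michigan": 500, "New York": 120, "Ohio": 80, "Other": 300}
print(share_of_visits(visits_by_state, "Michigan"))  # 50.0

visits_by_combo = {
    "Internet Explorer / Windows": 600,
    "Firefox / Windows": 250,
    "Safari / Macintosh": 100,
    "Firefox / Macintosh": 30,
    "Opera / Windows": 20,
}
# The first three combinations account for 95% of these sample visits.
print(combos_covering(visits_by_combo))
```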
Kate starts looking for report data to build her personas. She has already generated user goals for her three personas, but the goals are pretty general, so she hopes to find more specific characteristics that are based on the real user population. Kate consults the Visitors Overview report and finds that about 75% of the site’s visits in the last month were from new visitors; she decides that the Regina and Monica personas will be new visitors to the site and quickly brainstorms a few questions that she thinks they might have, based on their goals, that motivate their site visits. The last persona, Rex, will be a returning visitor.
Kate knows that the overwhelming majority of patrons are local because it is a regional theater company. She checks the Map Overlay report and sees that at the state level, about half of the visitors come from Michigan, where the theater is located. She decides that Monica comes from another state, and picks New York because it’s in second place behind Michigan, and because of the level of activity of the theater community in New York City. Kate drills down to view the traffic from Michigan, and chooses the top city for Rex’s home; the city is near the theater, so this makes intuitive sense. For Regina, who is planning to travel a little further, she selects the #4 city, which is about an hour away, and is a much bigger city. The visitors from that city have longer visits and a lower bounce rate, so she feels these characteristics would match well with Regina’s goal of planning an out-of-town visit to the theater. Coming from that city, she will also want to have dinner and stay the night at a local bed-and-breakfast, so Kate jots down these additional goals for Regina.
Since two of her personas are new visitors, Kate looks up the Traffic Sources Overview and then the Referring Sites and Keywords reports. There’s a lot of search engine referral traffic, and some strong referrers among regional event listings sites. She decides that Regina got to the site from an event listings site that refers a lot of traffic, and that Monica arrived from a Google search on the phrase, “auditions in Michigan.” Kate thinks that a logical reason Monica would be searching for auditions in Michigan is that she’s planning to move there from New York, so Kate adds this detail to Monica’s persona.
Step Four: Fill in the Blanks
The next step is to “fill in the blanks” from the report data. Make a template for each persona, and first fill in whether they are a new or returning visitor. If you have segmented profiles on new versus returning visitor status, draw the remaining characteristics of your new visitors evenly from the new visitors profile, and likewise for the returning visitors. When you have distributed the other statistics (browser, operating system, and geographical location) among your persona templates, review them against the unfiltered “all visitors” profile for a reality check to make sure you have not unintentionally over-represented a user characteristic, which is one hazard of using segmented data. If you have no preconceptions about user goals, you can distribute the report characteristics randomly at this point, as there is not necessarily much meaningful interplay between the statistics for new/returning status, geographic location, and browser/OS. Alternately, using a goal-oriented approach as in the example, you can select persona characteristics from the user data that make sense with the goals you have established.
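For the random variant of this step, one possible sketch follows. It assumes you have already prepared, for each characteristic, a list with one value per persona in roughly the right proportions. The helper names, the sample values, and the 15-point tolerance used in the reality check are illustrative assumptions, not anything Google Analytics provides.

```python
import random
from collections import Counter


def fill_templates(names, pools, seed=None):
    """Randomly hand out one value per characteristic to each persona.

    pools maps a characteristic to a list of values, one per persona;
    if a list is shorter than names, the extra personas get no value.
    """
    rng = random.Random(seed)
    personas = {name: {} for name in names}
    for characteristic, values in pools.items():
        shuffled = list(values)
        rng.shuffle(shuffled)
        for name, value in zip(names, shuffled):
            personas[name][characteristic] = value
    return personas


def over_represented(personas, characteristic, all_visitor_shares,
                     tolerance=15):
    """Flag values whose share among the personas exceeds their share in
    the unfiltered profile by more than tolerance percentage points."""
    counts = Counter(p[characteristic] for p in personas.values())
    flagged = {}
    for value, count in counts.items():
        persona_share = 100 * count / len(personas)
        actual_share = all_visitor_shares.get(value, 0)
        if persona_share - actual_share > tolerance:
            flagged[value] = (persona_share, actual_share)
    return flagged


# Hypothetical pools; real values would come from your own reports.
pools = {
    "visitor_type": ["new", "new", "returning"],
    "browser_os": ["Internet Explorer / Windows", "Firefox / Windows",
                   "Safari / Macintosh"],
}
personas = fill_templates(["Regina", "Monica", "Rex"], pools, seed=1)

# Reality check against made-up shares from the "all visitors" profile.
all_visitors = {"Internet Explorer / Windows": 60, "Firefox / Windows": 25,
                "Safari / Macintosh": 10}
print(over_represented(personas, "browser_os", all_visitors))
```

A flagged value is not necessarily a mistake. It is a prompt to confirm that the over-representation is deliberate, for instance because a small segment matters for business reasons.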
Kate took a goal-oriented approach to building her personas, so she has already assigned the report data to the personas. She builds her normal persona description template with the notes she made while looking at reports and adds OS and browsers based on the Google Analytics report to each of them. Kate then starts drilling down into the Google Analytics reports’ segmentation to add more detail. She clicks on Rex’s city in the Map Overlay to check the average visit length, bounce rate, and number of pageviews in the visit, which she uses to help her think about which pages Rex would be looking at, given his goals and those averages. Visits from Regina’s city are a little longer, so Kate considers what pages might show up, and checks the event listings site that referred Regina’s visit to find out what Regina might already know before visiting the theater’s site. Kate also checks on the referrers and keywords for visits from NYC and verifies that they contain some phrases similar to the one she chose for Monica.
Step Five: Bring the Personas to Life
The fifth and final step is to breathe life into these rough skeletons of personas. This is the familiar practice of generating the rest of the fictitious biography of the user, the detailed picture of who that person is and what motivates her or him, and so on. Let your creativity take over and build off the initial characteristics from the web analytics data to create a coherent persona. For example, the assigned browsers and operating systems should guide the determination of the computer makes and models that your personas use. Use the new or returning visitor status to assign the personas a level of comfort with using your site and their motivations for the site visits. The geographic location determined from the user data can help generate appropriate user goals and challenges, as well as occupations and hobbies, which may differ for domestic and international users. The reports on Keywords and Referring Sites offer insight on visitors’ interests and motivations, albeit slightly abstracted, and are a good starter for writing usage scenarios.
Kate spends some more time fleshing out her personas, and eventually decides that she needs more information about Rex, the returning patron and would-be donor. She asks the theater for some information from their patron database about how often regular patrons from Rex’s city visit the theater. Kate also interviews the company’s Development Director to gain more perspective on the characteristics of the theater’s existing donors from the local area. After learning more about the types of donors that the theater attracts and the general giving patterns they have, Kate feels that Rex is a good representation of the kind of potential donor who would visit the theater’s website repeatedly, and adds in some additional details based on her interview with the Development Director.
If you have other sources of user data, this is a great time to work them in. Survey data can often provide useful demographics that web analytics cannot, such as users’ age, sex, and education level. Free answers from surveys, interviews and focus groups are great sources of inspiration for filling in the details that make personas come to life. The Google Analytics Keywords report can sometimes provide the very questions that bring users to your site, and where better to answer them than in the design process?
Even when there is relatively little user data available to aid in the process of persona development, leveraging the resources at hand creates a stronger design tool. The 5-step process presented here aims to provide a starting point for developing personas using web analytic user data, rather than relying solely on assumption or imagination. An evidence-based approach like this one can lend structure and credibility to using personas and scenarios in the design process. At the same time, user data and statistics must be creatively synthesized to produce a useful representation, and imagination is always required to transform a user profile into a persona.



