Understanding Twitter Username Dynamics

Week 10.2: On the dynamics of username change behavior on Twitter

Estimated read time: 1:20

    Summary

    This lesson covers the dynamics of username change behavior on Twitter, based on a study that tracked 8.7 million users for two months. The study found that 73% of users modified their profile attributes and about 10% changed their usernames. Notably, 20% of the users triggered 85% of the username changes. The motivations behind these changes include gaining more space for text under Twitter's character limit, participating in trending events, gaining or losing anonymity, adjusting to real-life changes, or malicious intentions like username squatting. The analysis found only a weak correlation between a user's popularity or activity and the frequency of username changes. Data collection was constrained by Twitter's API restrictions, so short-interval tracking was limited to a random sample of 10,000 users drawn from the larger user pool.
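
The random sampling mentioned in the summary can be illustrated with a short sketch. It assumes the users who changed their username at least once are already available as a collection of ids; the function name, the fixed seed, and the toy ids are hypothetical and are not taken from the paper.

```python
import random


def sample_for_frequent_tracking(changed_user_ids, sample_size=10_000, seed=0):
    """Pick a uniform random subset of users who changed their username at least once."""
    # De-duplicate and sort so that the same seed always yields the same sample.
    pool = sorted(set(changed_user_ids))
    if sample_size >= len(pool):
        return pool
    rng = random.Random(seed)  # fixed seed only for reproducibility (an assumption)
    return rng.sample(pool, sample_size)


# Toy stand-in for the ids of users who changed their username at least once.
changed = range(1, 50_001)
tracked = sample_for_frequent_tracking(changed)
print(len(tracked))
```

Sampling without replacement keeps every tracked user distinct, which matters when each tracked user costs API calls at every short-interval scan.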

      Highlights

      • 20% of users account for 85% of username changes; common motives include space gain and trend participation.
      • Some users change usernames to gain anonymity, while others may do it for malicious purposes like squatting.
      • Figure 3 in the lesson shows the distribution of users by how often they change usernames: a few change frequently, most change rarely.
      • Challenges in the study include API limits from Twitter, which restricted more extensive data collection.
      • The findings could help improve online privacy and identify behavior patterns related to social media identity changes.

      Key Takeaways

      • 73% of the tracked Twitter users changed their profile attributes, while about 10% changed their usernames.
      • Username change behavior follows the Pareto principle, where 20% of users make 85% of changes.
      • Reasons for changing usernames range from gaining space to participating in trends or evading identification.
      • Username squatting is a common issue on Twitter, akin to domain squatting.
      • There is a weak correlation between a user's popularity or tweet activity and the frequency of username changes.
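
The 20% / 85% takeaway can be checked with a small calculation. The sketch below assumes a list of per-user change counts; `top_share` and the toy counts are hypothetical, while 12,648 and 14,880 are the figures quoted in the transcription.

```python
def top_share(change_counts, top_fraction=0.20):
    """Fraction of all changes made by the most active `top_fraction` of users."""
    counts = sorted(change_counts, reverse=True)
    total = sum(counts)
    if not counts or total == 0:
        return 0.0
    # Number of users in the top group; at least one user.
    k = max(1, round(len(counts) * top_fraction))
    return sum(counts[:k]) / total


# Toy data: 10 users, two of whom change their username very often.
toy_counts = [40, 28, 3, 2, 2, 1, 1, 1, 1, 1]
toy_share = top_share(toy_counts)

# Figures quoted in the lecture: 12,648 of 14,880 changes came from about 20% of users.
reported_share = 12_648 / 14_880

print(toy_share, reported_share)
```

The reported numbers are consistent with each other: 12,648 divided by 14,880 is 0.85.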

      Overview

      In this session, we explore the intriguing world of Twitter username dynamics, a topic dissected through a meticulous study of user behavior on the platform. Over two months, researchers tracked 8.7 million Twitter users to observe how often usernames and profile attributes were altered and the reasons behind such changes.

        The study highlighted that a significant chunk of users altered their usernames for reasons ranging from pragmatic needs, such as gaining more tweet space, to more complex motives like engaging with trending topics, hiding their identity, or making a fresh start. Interestingly, 20% of the users accounted for a majority of the username changes, showcasing a pattern similar to the famed Pareto principle.

          Despite these revelations, the study faced certain constraints, primarily due to Twitter's API limitations, which capped the data collected at 10,000 users for frequent observation. This restriction paved the way for future research avenues, suggesting that scaling up the dataset could provide more comprehensive insights into username change behaviors across different user demographics.
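
The tracking described in this overview comes down to comparing two consecutive scans of the same users. The following is a minimal sketch, assuming each scan is a dictionary from a stable user id to the username seen at that scan; the function name and the sample handles are hypothetical. Case-only differences are ignored because, as the lecture notes, Twitter usernames are case-insensitive.

```python
def username_changes(previous_scan, current_scan):
    """Return {user_id: (old_name, new_name)} for users whose username changed.

    Both scans map a stable user id to the username observed at that scan.
    """
    changes = {}
    for user_id, old_name in previous_scan.items():
        new_name = current_scan.get(user_id)
        if new_name is None:
            # User missing from the later scan; skipped in this sketch (an assumption).
            continue
        # Compare case-insensitively so "alice" -> "ALICE" is not counted.
        if old_name.casefold() != new_name.casefold():
            changes[user_id] = (old_name, new_name)
    return changes


scan_1 = {1: "ponguru", 2: "alice", 3: "bob", 4: "carol"}
scan_2 = {1: "pk_prof", 2: "ALICE", 3: "bob"}

changed = username_changes(scan_1, scan_2)
print(changed)
```

Keying on the user id rather than the username is what makes the comparison possible, since the username is the attribute that changes.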

            Chapters

            • 00:00 - 00:30: Introduction to the course and paper topic The chapter provides an introduction to the course titled 'Privacy and Security in Online Social Media' offered on NPTEL. It sets the stage for studying various patterns related to privacy and security in online social media. The focus is on continuing the analysis and understanding of different patterns and behaviors observed on social media platforms.
            • 00:30 - 01:00: Brief overview of the paper This chapter provides a brief overview of a paper titled 'On the dynamics of username changing behavior on Twitter'. It touches upon the topic of username changes by Twitter users, exploring the reasons behind these changes, the frequency with which they occur, and other related dynamics.
            • 01:00 - 01:30: Paper's research and findings The chapter titled 'Paper's research and findings' delves into the individuals responsible for changes and the benefits derived from altering user handles. The paper is highlighted as intriguing with significant implications. The abstract indicates that prior research demonstrates considerable findings, setting the stage for detailed exploration in the following sections.
            • 01:30 - 03:00: Research methodology and dataset The chapter discusses the methodology and dataset used in a study of Twitter user behavior, specifically focusing on changes in usernames over time. Researchers examined 8.7 million Twitter users over a two-month period. The data collection process is noted to be complex, but it provides a high-level conclusion that only a few users frequently change or favor certain usernames.
            • 03:00 - 04:00: Analysis of username change behavior The chapter titled 'Analysis of username change behavior' examines the patterns and reasons behind why users change their usernames. It concludes that a small group of people change their usernames multiple times, while a slightly larger group changes them infrequently. The chapter also investigates the motivations behind these changes.
            • 04:00 - 05:30: Correlations with popularity and activity The chapter titled 'Correlations with popularity and activity' discusses the use and growth of Twitter. It covers how Twitter is utilized by users, the types of data shared on the platform, and mentions an abstract which summarizes the main points of a paper related to these topics.
            • 05:30 - 06:00: Survey and user reactions The chapter titled "Survey and user reactions" discusses findings from a dataset involving 8.7 million Twitter users over a two-month period. It highlights that 73.21% of these users changed their profile attributes and assigned new values during this time. Additionally, around 10% of users changed their usernames.
            • 06:00 - 07:30: In-depth analysis of username change reasons The chapter provides an in-depth analysis of the reasons why users change their usernames. It reports that approximately 73 percent of the users alter their profile attributes. A graphical representation shows different attributes on the x-axis and the percentage of users who made changes on the y-axis, with colors indicating the frequency of changes.
            • 07:30 - 09:00: Patterns of username use and conclusions This chapter examines the patterns of username changes among Twitter users. It reveals that approximately 73.21% of 8.7 million users change their attributes, whereas only about 10% of users change their username. The chapter concludes by summarizing the contributions of the paper, though specific contributions are not detailed in the transcript.

            Week 10.2: On the dynamics of username change behavior on Twitter Transcription

            • 00:00 - 00:30 Welcome back to the course Privacy and Security in Online Social Media on NPTEL. So, what we will do now is continue the pattern that we have been following, which is studying how different kinds of patterns can be analyzed on social media.
            • 00:30 - 01:00 I am going to look at this paper called 'On the dynamics of username changing behavior on Twitter'. I think we have mentioned this topic briefly in the past, which is how users actually change their usernames, why they change them, how frequently the change happens,
            • 01:00 - 01:30 who these people are who are changing them, and what the benefits of changing the user handles are; that is what we are going to look at. This is an interesting paper which has some interesting implications also. So, in the abstract, the authors say that past studies show that a substantial
            • 01:30 - 02:00 section of Twitter users change their username over time. And the authors actually look at 8.7 million users on Twitter for a duration of two months. The data collection is slightly interesting and complicated also; we will look at it as we progress. So the high-level conclusion from the study is that few users favor a username by repeatedly
            • 02:00 - 02:30 choosing it multiple times. So, essentially what the paper concludes is that there is a small set of people who actually change their handles many times and a slightly larger set of people who change them very few times, and the paper also looks at the reasons why people actually
            • 02:30 - 03:00 change their usernames. As I have said before, the abstract only summarizes what is in the paper; then in the introduction they talk about the whole growth of Twitter, in terms of how users are actually using it and what kind of data is being pushed onto Twitter, so that is what is being discussed
            • 03:00 - 03:30 here. So, one of the conclusions that the authors have is that in their dataset of 8.7 million Twitter users tracked for two months, they observe that 73.21 percent of users change their profile attributes and assign new values, and about 10 percent of users change
            • 03:30 - 04:00 their usernames in total. So, this is basically showing you that about 73 percent of the users change their profile attributes and assign new values. So, you can just see here that the x-axis is the different attributes, and the y-axis is the percentage of users who have changed. And the color here shows the number of times that the changes were made.
            • 04:00 - 04:30 Two means users changed the value twice, and then three times, four times and five times. 73.21 percent of the 8.7 million users change their attributes on Twitter and just about 10 percent of users change their username. So, now let us look at the contributions of the paper.
            • 04:30 - 05:00 So, there are three contributions of the paper. 20 percent of users trigger 85 percent of username changes; again this is the same Pareto principle that we have seen in the past, or a power law pattern, that is, 20 percent of
            • 05:00 - 05:30 users trigger 85 percent of username changes; these users are observed to change 5 times or more. Username changing behavior follows a Pareto principle; 10 percent of username changes occur after an hour of the earlier username change. I think the username changing pattern is also interesting because there are multiple reasons
            • 05:30 - 06:00 why people change it. People change because they want to get some space. For example, if my account when I started was Ponnurangam Kumaraguru, which is pretty long, and I change the account to Ponguru, then when users tag me or mention me in their posts, they would actually get more space to write the content.
            • 06:00 - 06:30 And all this is happening just because there is a space constraint on Twitter. Whereas on Facebook, if you see, there is a lot more space for the content, and therefore Facebook actually allows you to change your username only once. Twitter allows you to change it as many times as you want; that is the reason why this problem is actually appearing.
            • 06:30 - 07:00 65 percent of users choose a new username unrelated to the old name, while 35 percent reused an old one sometime later. You will actually see a table later also where there is a small set of people who actually collude that is in a group, they would actually use the same name and different users will
            • 07:00 - 07:30 start using the same handle within the group. The reasons to change usernames include benign reasons like space gain, suiting a trending event, gaining or losing anonymity, adjusting to real-life events and avoiding boredom, and malicious intentions like obscured username promotion and username squatting.
            • 07:30 - 08:00 I will just tell you quickly what they are, and then when we go into the paper as we move forward, we can actually look at them in detail. Space gain, as I said, is Ponnurangam Kumaraguru to Ponguru. Suit a trending event: for some event that is going on, let us take IPL, cricket or football, I would change my handle to look very similar to it, and therefore I will get more attention. Gain or lose anonymity: I create an account Ponguru, which is Ponnurangam Kumaraguru, which
            • 08:00 - 08:30 is probably very identifiable, whereas if I have an account saying a guy from Chennai, the anonymity is pretty high. Adjust to real-life events: things are changing in my life. So, let us say earlier I was a grad student; I could have 'graduate' in my user handle,
            • 08:30 - 09:00 but now that I am a professor, I could actually use 'professor' in my user handle. Avoid boredom: it is boring since I have been using Ponguru for a long time. Malicious intent is obscured username promotion: I could actually change my user handle to one which is very similar to somebody who is popular and actually get my handle
            • 09:00 - 09:30 promoted. And username squatting: I could actually register an account called Amitabh Bachchan now and keep it, and whenever Amitabh Bachchan actually wants to create an account, he would actually have to take the account from me. This username squatting is actually a pretty common problem in terms of usernames also.
            • 09:30 - 10:00 There was an incident even in India when the current central government wanted to have an account; there was an issue with PMO India. So, squatting on a handle like ponguru that somebody actually wants to use is a problem, and this is a traditional problem with domain names in general also. Somebody could squat on a URL like pmoindia.in or pmoindia.com or apple.com, and they could
            • 10:00 - 10:30 actually make others pay for it. I think there was an experience with housing.com: when they wanted the domain, there was squatting on that domain name.
            • 10:30 - 11:00 So, now let us look at the data set collection. So, as I said before, the related work mentions three different areas that this paper addresses. The first is evolving user behavior, how users are actually changing their behavior online. And the second one is profile linking, which we have seen in this course before in terms
            • 11:00 - 11:30 of actually connecting two user handles and finding out whether they are the same, and the third is identifying malicious users. All these three different types of domains actually come into this research work, so the authors actually mention this related work. Data collection: in terms of the total data that was collected, the authors created a large seed data set, tracked the seed set for two months every fortnight,
            • 11:30 - 12:00 found users who change usernames more often than others, filtered these users and tracked their profiles every 15 minutes. Essentially, what the authors did was take the large data set and track it every fortnight. And for a small user set from this larger data set, they were actually
            • 12:00 - 12:30 tracking it every fifteen minutes. We will actually explain later why the authors took this approach. So, the seed users are those who participated in 17 local and global events during April
            • 12:30 - 13:00 1, 2013 to September 3, 2013. So, essentially there has to be some way of collecting the users. So, one approach that they took is: for events between April and September 2013, the handles of all the people who posted about these global events were taken, and the data for these 8.7 million users, which is their user handles, was collected.
            • 13:00 - 13:30 Seed tracking: from now on, the 8.7 million users are the seed users. The authors tracked the 8.7 million users for any username changes by querying them every fortnight within the period of October 2013 to November 2013. So, for 8.7 million users, every fourteen days go and check whether they have actually changed
            • 13:30 - 14:00 their handle. By comparing two consecutive scans, old and new usernames of a user were recorded; that is, if fourteen days before my account was Ponguru, and today my account is ponnurangam dot kumaraguru, both of them are actually captured. Twitter usernames are case-insensitive; therefore, any case changes were not counted
            • 14:00 - 14:30 as username changes. We found that 853,827 users of the 8.7 million users which are about 10 percent, changed their usernames at least once during a small observation period of 2 months. In these 2 months, 10 percent of these 8.7 million users changed their user handle which is actually pretty large, 10 percent of users changing their user handles.
            • 14:30 - 15:00 So, now, we cannot actually collect all the 8.7 million users for very frequent data collection. So, the authors actually decided to sample: tracking users who do not participate in such behavior had little value, that is, the users who do not change their usernames. We therefore
            • 15:00 - 15:30 filtered 711,609 users who changed their usernames at least once and randomly sampled 10,000 users to monitor them at short intervals. The idea is to find people who are changing their usernames and from them create a small sample to collect data,
            • 15:30 - 16:00 and the reason why the authors chose to collect a smaller data set of only 10,000 is that if you had to make so many API calls to Twitter, it would be impossible. That is what the authors say here if you look at it: quicker scans would need 1,462 application authentication tokens.
            • 16:00 - 16:30 And therefore, it is going to be actually hard to do that. Now that we have seen the different types of seeds that the authors used, here is the table that actually gives you the details: fortnight scan, October 16, 2013 to November 26, 2013, 8 million users; 15-minute scan, November
            • 16:30 - 17:00 22 to January 22 of 2015, 10,000 users. Out of the 10,000 users, 4,198 users changed their usernames at least once in 14 months, constituting 14,880 username changes; about 20 percent of users changed 5 times or more,
            • 17:00 - 17:30 triggering around 12,648, that is 85 percent, of username changes. And so we will see the figures also. So, one user changed her username 113 times in fourteen months, which on manual inspection turned out to be an inorganic user with half-completed tweets, tweets with the same text, and frequent posts in short duration. So, it is essentially saying that it is not necessarily
            • 17:30 - 18:00 a legitimate or human user. So, the conclusion that you want to remember is: out of 10,000 users, 4,198 users changed their username at least once in 14 months. So, the reference here is to figure 3. Let us go look at figure 3. So, here is figure 3, which actually shows the user distribution for frequency of changing
            • 18:00 - 18:30 usernames: 20 percent of the users frequently changed their usernames and 80 percent of the users changed rarely. So, again like last week's paper that we saw, it is a percentage of users, where you can actually see that the first part, until about 10 or 12, is actually very short.
            • 18:30 - 19:00 So, the inset there is actually giving you a more detailed view of the data: frequency of username change versus number of users. So, one user changed it 113 times in 14 months.
            • 19:00 - 19:30 Around 20 percent of the username changes were triggered within a day of the previous username change. We observe a Pareto distribution, with 20 percent of the users frequently changing usernames in short intervals, and 80 percent of the users changing rarely after a long duration. So, this is in figure 2(a); that is the distribution that you want to actually
            • 19:30 - 20:00 look at. So, again there is an inset here to show the number of days for a username change; this is 0 to 600, whereas this is just showing 0 to 1. And the frequency of username changes here, right, this is the
            • 20:00 - 20:30 percentage of username changes. So, (a) is giving you that; normalized longest common subsequence length and position of change relative to usernames, we will see all of these. So, now look at the usernames themselves. Specifically looking only at the usernames, we actually see popularity versus
            • 20:30 - 21:00 frequency of change. We measure the popularity of the 4,198 users using followers, that is, in-degree (you know what in-degree is), and plot it against the frequency of username change, which is the number of followers that I have versus the number of times my username changes. This will actually give interesting results, like whether popular users who have
            • 21:00 - 21:30 a lot more followers are actually changing their usernames more frequently than people who have fewer followers. To find the correlation between the two, the authors basically removed everybody who had too many, greater than one million followers, and too few, which is less than one follower, because there is no sense in having both these types of users; it will
            • 21:30 - 22:00 basically skew the analysis that we are looking at. We observe that username change frequency is weakly yet positively correlated with the in-degree of the user; a significant positive correlation would imply that the higher the popularity, the higher the frequency of change; however, a weak correlation does not guarantee
            • 22:00 - 22:30 the same. In this case the authors only found a weak correlation. So, there may be a chance that the in-degree, the number of followers, is actually affecting the username changes, but it may also not be affecting them. Figure 4(a), if you see here, shows the number of followers from 1 to 10 to the power
            • 22:30 - 23:00 of 6 against the frequency of username changes. So, this is not really meaningful; if it had been a positive correlation, we could have actually seen all the data like this, which is, as the number of followers increases, the number of times the username changes also increases, so it could be a linear graph. Whereas this graph is showing you that it is not really positively correlated. Also, if you look at another metric, which is percentage of tweets posted versus frequency
            • 23:00 - 23:30 of username change, a weak correlation implies that popularity and activity have little impact on the choice to change a username. The second graph here, figure 4(b), shows the frequency of username change against the users' activity.
            • 23:30 - 24:00 To find the correlation between the two, again users with more than ten thousand tweets or less than one tweet were removed; we observed a weak and positive correlation between the two, same as for the number of followers. So we have a weak correlation between these two, the number of tweets posted and the frequency of username change. So, that gives you a sense.
            • 24:00 - 24:30 So, let us go over all the analysis that you have seen until now; it basically says that 20 percent of the users change five times or more, that there are people who have changed 113 times, and that there is a weak relationship between popularity, or the number of posts somebody makes, and username changes; that is what we have learnt until now. So, for studying the reasons why people change the name, Paridhi, the first
            • 24:30 - 25:00 author of the paper, actually tried doing some interesting things. She created a survey with some questions asking why users actually change their usernames. And she posted tweets tagging the users who had actually changed their usernames.
            • 25:00 - 25:30 Interestingly, we got both very positive reactions and very negative reactions. There were people who actually said: why are you tracking us, who are you, why are you trying to understand the username changes that I have made, and why are you asking me all these questions.
            • 25:30 - 26:00 Some users actually gave the reasons why they are changing their usernames and explained things. Here are some of the reasons that we mentioned in the abstract for why people change their usernames: space gain, of course; Ponnurangam Kumaraguru versus Ponguru would help them get more text into the posts.
            • 26:00 - 26:30 Let us also look at figure 5, which is showing you the length difference between the names. So, if you look at it, space gain is on the x-axis, and the y-axis is the old
            • 26:30 - 27:00 username length, which is to find out, if I moved from ponnurangam kumaraguru to Ponguru versus Ponguru to ponnurangam kumaraguru, what is happening and why users are doing that. So the authors calculated the length difference between the new and the old names of users, and separately represent users with old names less than and greater than the median
            • 27:00 - 27:30 length of eleven; that was because, I think, the data itself showed that the median length of a user handle was 11. The authors observed that 75 percent, 75.19 percent, of long usernames moved to short or same-length new usernames.
            • 27:30 - 28:00 While 60.87 percent of short usernames picked long new usernames. So, it is kind of a similar percentage of people who are actually flipping from small to big, and big to small. In other words, most users with old usernames shorter than 11 tend to add characters to their usernames, while most users with old usernames longer than 11 prefer to remove characters from their
            • 28:00 - 28:30 new usernames. Old username length greater than 11 is shown in red here. So basically, with space gain, if I actually reduce my user handle size, I am actually getting more space. If I am increasing the characters, I am losing space; that is what is positive
            • 28:30 - 29:00 and negative here. It moves from, so old usernames old username less than 11, where if you see here old username less than 7 tend to add characters, so that is what is here, a space gain and this is negative space gain. old username less than eleven characters.
            • 29:00 - 29:30 So, this is blue is old username is less than eleven characters. So, they are actually getting space gain which is positive. Old username greater than eleven which are getting, they are adding more characters. So, they getting space gain negative, they are losing space. That would help you to understand what kind of username changes are happening on Twitter.
            • 29:30 - 30:00 In terms of the 10,000 users that the authors are actually analyzing. Maintain multiple accounts: a few users exchange usernames between their multiple accounts. That is, I maintain three accounts and keep changing the usernames between these three accounts. So, some users in the data set change username to reverse the identifiability
            • 30:00 - 30:30 of the users, either to make them personal or anonymous. So, for example, I could actually have a username Ponguru, which is probably identifiable, and I move from Ponguru to professor from Chennai and that would make it anonymous, compared to professor from Chennai to Ponguru, which will make it more identifiable. So, just take a look at this table; this is something I mentioned earlier, but I will
            • 30:30 - 31:00 actually explain what happened in the data set now. So, you should look at the first column, which is the id, the unique id that the users have on Twitter; we have put x so that you cannot identify the users for now. Scan 1 - Peshawar underscore sms; scan 2 - Peshawar underscore went to the next user in row 2.
            • 31:00 - 31:30 And scan 3 - Peshawar underscore sms went to user 3. So, it is the same group, the Sajan group, and given that the authors were tracking every 15 minutes, we could actually find out that at every scan the user handle was with a different user from the same group.
            • 31:30 - 32:00 It could be the case that all these handles are actually managed by the same person; that is a possibility. But what we found was this: usernames within a group have been shared, and different accounts start using the same user handle. So, the last column is date of observation, which is showing you that we captured data
            • 32:00 - 32:30 in different snapshots. So, looking at the reasons for username changes: adjust to events, which is, one user actually said that the user was associated with an event, the event finished and the user started connecting with another event, and so the handle was changed,
            • 32:30 - 33:00 for example, pwifanclub to ForceIndia. So, this is the explanation that I gave with table two, which is that the authors found that a few users collaboratively pick the same username at different time stamps; I have already walked you through the table, which actually gives you a sense of how
            • 33:00 - 33:30 the handles have been managed. Username squatting: username squatting is actually against the Twitter rules, but users actually generate user handles and keep them, so that they can monetize them when other users want to have these handles. So, that ends the paper. So, this is essentially a paper which talks about how frequently users change
            • 33:30 - 34:00 the handles, why they change the handle, what kind of patterns there are, and who is changing the handle: people who are popular versus people who are not popular, people who post a lot more text versus people who post less text; that is the kind of analysis that this paper does.
            • 34:00 - 34:30 This paper can be actually very useful in terms of even analyzing and even making some inferences on username changes. Also here is a table which actually also says few reasons for username changes. Privacy, for privacy since as my initials and part of my full name.
            • 34:30 - 35:00 So, people have actually given the reasons why they change their usernames: privacy, privacy and abuse, link all accounts, use real name, use an easier, shorter username; and reading the text in column one: violates wiki policy, violates wiki policy, violates wiki policy for religious reasons.
            • 35:00 - 35:30 So, these are not user handles from Twitter. The authors actually got a chance to look at username changes on Wikipedia, and these are the reasons that people had mentioned; even on Wikipedia you can actually change your handle. Of course, all of this kind of work has to have some kind of limitation. So, here, due to the Twitter API restrictions, only ten thousand users' data was actually
            • 35:30 - 36:00 collected and analyzed for the fifteen-minute scan; that is one of the biggest limitations of the study. And there are many directions in which people could take this kind of work; one direction is extending, increasing the data
            • 36:00 - 36:30 set of the analysis itself; studying a much larger data set may give some more results which are generalizable to a larger audience also. With that I will stop this paper; I will see you soon.
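
The space gain measure discussed around figure 5 in the transcription can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: space gain is taken as old length minus new length (positive when the handle gets shorter), users are split at the median old-name length of 11 quoted in the lecture, and names of exactly 11 characters are placed in the short group, which the lecture does not specify. The function names and sample handles are hypothetical.

```python
MEDIAN_OLD_LENGTH = 11  # median handle length quoted in the lecture


def space_gain(old_name, new_name):
    """Characters freed by the change: positive when the new name is shorter."""
    return len(old_name) - len(new_name)


def group_space_gain(changes, median_length=MEDIAN_OLD_LENGTH):
    """Split space gains by whether the old name was longer than the median length."""
    groups = {"long_old": [], "short_old": []}
    for old_name, new_name in changes:
        # Names of exactly `median_length` characters go to "short_old" (an assumption).
        key = "long_old" if len(old_name) > median_length else "short_old"
        groups[key].append(space_gain(old_name, new_name))
    return groups


# Hypothetical (old, new) username pairs.
sample_changes = [
    ("grad_student_pk", "pk"),              # 15 -> 2 characters, gain of 13
    ("pk", "prof_pk"),                      # 2 -> 7 characters, gain of -5
    ("longusername12", "longusername34"),   # same length, gain of 0
]

gains = group_space_gain(sample_changes)
print(gains)
```

With real data, the share of non-negative gains in the long group and of negative gains in the short group would correspond to the two percentages read out in the lecture.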