What Do We Owe Students When We Collect Their Data – a response

It has been a few weeks since we issued our #DigCiz call for thoughts on the question “What do we owe students when we collect their data?” and there have been a few responses. The call is in conjunction with the interactive presentation at the EDUCAUSE Annual Conference that I’ll be helping to facilitate with Michael Berman, Sundi Richard, and George Station. The session will be focused around breakout discussions both onground and online during the session. We don’t necessarily have “answers” here – the session (and the call) are more about asking the questions and having discussion. The questions are too big for one session and often there are not easy answers; so we released the call early hoping that people would respond before (or after) the session. I’ve yet to respond to it myself so I’m going to attempt to do that in this post.

The #DigCiz Call

We want the call to be open to everyone – even those who don’t know a ton about student data collection – and we want people to respond using the tools and mediums that they like. We have had some great examples already and I wanted to thank those who have responded so far. I threw the call out to some of our students at SNC and I was super honored that Erica Kalberer responded with an opinion piece. Erica does not study analytics; she is not a data scientist or even a computer science major. She didn’t do any research for her post and it is an off the cuff, direct, and raw response from a student perspective – which I love.

Additionally, Nate Angell chose to leave a hypothes.is annotation on the call itself over at digciz.org.

Nate points out that there are many “we”s who are collecting student data and that students often have no idea who the players are that would want to collect their data, let alone what data is being collected and what could be done with it. What do we mean when we ask “What do WE owe students….” Who is this we? Instructional designers may answer these questions very differently than accreditors would, or librarians would, or even students themselves would. I hope that by hearing from different constituencies we can bring together some common elements of concern.

Framing Things Up

I am really intrigued by our question but I also have some issues with it.

The question is meant to provoke conversation and so in many ways it is purposefully vague and broad. It is not just “we” that could be picked out for further nuance. So many simple definitions could be picked out of this question. What is meant by “data” and more specifically “student data”?

What are we talking about here? Is this survey data? Click data from the LMS or other educational platforms? What about passive and pervasive collection that is more akin to what we are seeing from the advertising industry? The kind of stuff that does not just track clicks but tracks where my cursor moves, how fast it moves, where eyes are on a screen, and text that has been typed into a form but has not been submitted. What about if we are using wearables or virtual reality? Does the data include biometric information like heart rate, perspiration, etc.? Is this personally identifiable information or aggregate data? Some of these examples seem particularly sensitive to me and it seems like they should all be treated differently depending on context.

We could keep going on…. What is meant by “collect”, “students”, “owe”… a whole blog post could be written just about any one of these things.

Another of my issues is that the question assumes that student data will be collected in the first place. I’m setting that issue aside for this call and presentation because whether I like it or not I am part of a field that is collecting student data all of the time. As an instructional designer I make decisions to use technologies that often track data and to be honest if I wanted to avoid those technologies completely I’m not sure that I could. Over the course of my career faculty and administrators have often come to me asking to use technologies that collect data in ways that I consider predatory. How do I respond? How do I continue to work in this field without asking this question?

People who know me or follow my work know that over the last few years I have often struggled with considering our responsibilities around student data. Even though I have been thinking about these kinds of questions for a few years now I don’t think that I will be able to dive into all of the nuance that any of these could bring. (I want to write all the blogs – but time). So, I just have to accept that – it is why this is a broader call for reflection and conversation, and why I invite others to respond to the call around things that I may have overlooked.

Though I am still new to this conversation, I’m not so new or naive as to think that there are not already established frameworks and policies for thinking about the ethical implications of student data collection. I’ve been aware of the work that JISC has been doing in this area for some time and had just started a deeper dive on some research when I attended the Open Education Conference in Niagara Falls a few weeks ago.

Somehow I missed that there were two important data presentations back to back and though I only caught about ¾ of the Dangerous Data: The Ethics of Learning Analytics in the Age of Big Data presentation from Christina Colquhoun and Kathy Esmiller from Oklahoma State University, I got the slides for Billy Minke and Steel Wagstaff’s “Open” Education and Student Learning Data: Reflections on Big Data, Privacy, and Learning Platforms which I missed completely.

Both of these presentations looked at different policies and ethical frameworks around using student data, which was a goldmine for me. Dangerous Data’s list did not make any claim about the quality of the frameworks while the Open Education and Student Learning Data presentation did specifically state that their list was curated for policies that they were impressed by.

Open Education and Student Learning Data listed:

Dangerous Data listed:

My Response

I’ve started reading through the policies and frameworks listed above and while I have not had a chance to dive deep with each one of them, I’ve found a lot of overlap with what I have identified as four core tenets that I believe start to answer the question “What do we owe students when we collect their data?” at least for me – for now. I’m personally identifying with “we”s as in instructional designers, college teachers, IT professionals, librarians (as an official wannabe librarian) and institutions – at least on some level.

I’m still learning myself and I could change my mind but for the purposes of this post I’m leaning on these four tenets. I feel like before we even start I need to say that there are times when considering these tenets, in practice, that the answers to the problems that inevitably arise come back as “well, that is not really practical” or “the people collecting the data themselves often don’t know that”. In these cases I suggest that we come back to the question “what do we owe students when we collect their data?” and propose that if we can’t give students what they are owed in collection that we think twice before collecting it in the first place.

I will list these tenets and then describe them a bit.

  • Consent
  • Transparency
  • Learning
  • Value

Consent

This one seems of the most importance to me and I was shocked to see that not all of the policies/frameworks listed above talk about it. I understand that consent is troubled, often because of transparency – more on that in a bit –  but it still strikes me that it needs to be part of the answer.

There is a tight relationship between ownership and consent; there is a need for consent because of ownership. If I own something then I need to give consent for someone else to handle it. But not all of these frameworks recognize that. The Ithaka S+R/Stanford CAROL project, listed above, talks about something called “shared understanding” where they basically envision that student data is not owned solely by the student but is a shared ownership between the school, the vendors, and third parties. In a recent EDUCAUSE Review article some of the framers of the project actually said “the presumption of individual data propriety is wishful thinking”. This, after they put the word “their” in scare quotes (“their” data) when referring to people being in a place of authority around the data about them. Ouch!

I mean I get what they are doing here. One looks at the Cambridge Analytica/Facebook scandal and says “oh how horrible” but their response is: you are a fool not to realize that it is happening all of the time. And maybe I am a fool but I still think it is horrible. The article points to big tech firms, how much data they already have about us, and how much money they have made with those data and uses it as a justification. But here is the thing, we are talking about students not everyday users. I think that makes a difference.

In another EDUCAUSE Review article Chris Gilliard points out the extractive nature of web platforms and the problems of using them with students. What of educational platforms? Is it really okay to import the same unethical issues that we have with public web platforms into our learning systems and environments? I’m comforted that most, if not all, of the other frameworks listed above and those that I’ve come across over the years do understand the importance of consent and ownership.

I’ve read broader criticisms of the notion of consent that I found quite persuasive by Helen Nissenbaum (Paywalled – sorry) but even she does not abandon consent completely. Rather she points out that consent alone, in and of itself, is not the answer. We need more than just consent – especially now when our culture grants consent so easily and thoughtlessly. Nissenbaum’s criticisms are aimed at thinking of consent as a free pass into respectful data privacy. But here I’m thinking of consent in terms of what we owe students – I see it as a starting place and the least of what we owe them.

What do we owe students when we collect their data? We owe them the decency of asking for it and listening if they change their mind.

How we ask for data collection and how we continue to inform students about how it is changing is not easy to answer and I want to be very careful of oversimplifying this complex issue. I think that, at least in part, it is also an issue of my next tenet – transparency.

Transparency

Asking for consent is no good if you are not clear about what you are asking for consent to do and if you are not in communication about how your practices are changing and shifting over time. In the policies and frameworks it seems like transparency is sort of a given – even the guys over at Ithaka S+R/CAROL have this one. We need transparency in asking for consent around data collection as consent sort of implies “informed consent” and we can’t be informed without transparency.  But we also need ongoing transparency of the actual data and of how it is being used.

I found a blog post from Clint Lalonde published after the 2016 EDUCAUSE Annual that pretty much aligns with how I feel about it:

“Students should have exactly the same view of their data within our systems that their faculty and institution has. Students have the right to know what data is being collected about them, why it is being collected about them, how that data will be used, what decisions are being made using that data, and how that black box that is analyzing them works. The algorithms need to be transparent to them as well. In short, we need to be developing ways to empower and educate our students into taking control of their own data and understanding how their data is being used for (and against) them. And if you can’t articulate the “for” part, then perhaps you shouldn’t be collecting the data.”

What do we owe students when we collect their data? We owe them a clear explanation of what we are doing with it.

But I actually think that Clint takes things a bit further than transparency at the end of that quote and it is there that I would like to break off a bit of nuance between transparency and learning for my third tenet.

Learning

Providing information is not providing understanding and while I can concede that in consumer technologies providing information for informed consent is enough, I think that we have an obligation to go further in education and especially in higher education. We have an obligation because these are students and they have come to us to learn. While they will learn from “content” they will learn a lot more from the experience of the life that they lead while they are with us. If that life is spent conforming and complying to data collection practices that they don’t understand and never comprehend the benefit of then, at best, they will graduate thinking all data collection is normal and they will be vulnerable to data collection practices from bad actors.

Of course this means that we ourselves need to better understand the data that we are collecting. It means that we need to know what is being collected and how it can be used ourselves before we start putting students through experiences where this is happening inside of a black box.

Inside of institutions we need to know what our vendors are doing. We need to create and articulate clear expectations about how we view the responsibilities of vendors around privacy and security. We need to vet their privacy and security policies and continue to check on them over time to see if any of those policies have changed. We need to build a culture of working with reputable companies. Then, we need to build that into the curriculum through increased digital, data, and web literacy expectations.

What do we owe students when we collect their data? We owe them an understanding, an education, about what their data are; what they mean; and what can be done with them.

Collectively, as teachers, librarians, instructional designers, administrators, product developers, institutions, etc. it seems that we will always have a leg up on this though – we will always be in a position of power over students. And so my final tenet has to do with the value of the outcome of data collection.

Value

Finally, if we are collecting student data I think that we should be doing it for reasons where we believe that the benefits to the student outweigh the potential costs to the student. This means putting the student first in the equation of the what, when, why and how of student data collection.

I also need to be clear that I’m not talking about a license to forgo consent, transparency, and learning because it is believed that the best interests of the student are intended. This is not an invitation to become paternalistic or to do whatever we want in the name of value.

My point is that the stakes are too high to be collecting student data for the heck of it, or because the system just does that and we are too busy to read the terms of service, or because someone is just wondering what we could do with it. If we have data we should be using the data to benefit students. If we are not using it we should have parameters around storage and yes even eventual deletion.

Collecting student data makes it possible to steal or exploit those data; while we can take precautions and implement security measures no data are as secure as data that were never collected in the first place and, to a lesser extent, data that were deleted. If we are going to collect student data then we have to do something of value with it. Having piles of data stored on systems that no one is doing anything with is wasteful and dangerous. If there is not a clear value in collecting data from students then it should not be collected. If student data has been collected and is not serving any purpose that is valuable to students and no one can envision a clear reason why it will hold value in the future then maybe we should discuss deleting it.

Amy Collier speaks to how data collection can particularly impact vulnerable students in Digital Sanctuary: Protection and Refuge on the Web? (at the end of which she presents seven strategies that you should also read – no really, go read them right now – I’ll wait). Collier starts with a quote from Mike Caulfield’s Can Higher Education Save the Web?

“Caulfield noted: “As the financial model of the web formed around the twin pillars of advertising and monetization of personal data, things went awry.” This has created an environment that puts students at risk with every click, every login. It disproportionately affects the most vulnerable students: undocumented students, students of color, LGBTQ+ students, and students who live in or on the edges of poverty. These students are prime targets for digital redlining: the misuse of data to exclude or exploit groups of people based on specific characteristics in their data.”

What do we owe students when we collect their data? We owe them an acknowledgement and explanation that we are doing something that will bring value to them with those data.

Summation – Trust

Policy is great but I think taboo is stronger.

I can’t get that power difference out of my head. I mean it is like the whole business model of education – knowledge is power and we have more knowledge than you but if you come to us we can teach you. There is this trust to it; this assumption of care. We will teach you – not, we will take advantage of you. And to offer that with one hand and exploit or make vulnerable with the other – yeah…

I’ve been working in educational technology for fifteen years and when I first started there was very little that I heard about ethics. Security, sure – privacy… that was a thing of the past, right? It seems that we are starting to see some repercussions now that are making us pause and I’m hearing more and more about these things.

Still, I see these conversations happening in pockets and while I’m seeing lots of new faces there are ones that are consistently absent. I wonder about new hires just entering the field, especially those in schools with little funding, and what kind of exposure they are given to thinking about these implications. I wonder if a question like “what do we owe students when we collect their data?” ever even comes up for some of them.

There is a whole myriad of issues that are now coming to light around surveillance and data extraction. What is happening to trust in our communities and institutions as we try to figure all of this out? 

Perhaps more than anything, what we owe students when we collect their data is a relationship deserving of trust.

Don’t forget

So, don’t forget, the #DigCiz call is open for you to respond how you see fit. Share your creation/contribution on the #DigCiz tag on twitter or in the comments on the #DigCiz post.

We go live Friday, November 2nd at 10 AM Eastern Time with a twitter chat and a video call into the session. Please join us!

~~~

Thanks go out to Chris Gilliard, Doug Levin, Michael Berman, and George Station, all of whom offered feedback on various drafts of this post.

Photo by Taneli Lahtinen on Unsplash

Privacy and Security in DoOO: First attempt at student resource

It has been a wild few months and it feels like there are a lot of things happening at once.

I’m thrilled that at St. Norbert we have gotten our Domains project off of the ground and I’m talking about and working with domains more than ever – which is wonderful.

However, a few months ago after attending DigPed Lab, those of you who follow regularly will recall, I had some serious questions about how to design for privacy and security with DoOO.

I had some great collaboration around this from comments on the post to backchannel conversations about what all is out there. I would be remiss if I did not particularly give a shout out to Tim C from Muhlenberg College and Evelyn Helminen from Middlebury College who gave me lots of feedback and resources. And of course to Chris G who just keeps me thinking about privacy in edtech in general.

I’d had some visions of pulling together a group who is interested in this topic but I found that things just moved too quick for me and I was in need of a resource that I could give to students before I could pull the group together. So, still working on that – if you have a particular need for this please put a fire under me.

I’ve struggled with this topic because it is such a nuanced thing. I love DoOO because of the focus on student ownership and agency. Privacy can be addressed with blanket best-practices but that is not the conversation that interests me.

I feel our domains project at SNC is particularly blessed in that we have our Tech Bar. We visited University of Mary Washington in building it and got a lot of tips from Martha Burtis and the students who work at the Digital Knowledge Center. I’m telling you all of this because I think it is important to contextualize this resource that I’ve built for students.

This first little resource around privacy and security with DoOO that I’ve built is directed at students and is really just meant to give them a taste of what is possible around naming, making pages private, securing sites, etc. I created a little infographic around this and at SNC I printed them up as large bookmarks. The SNC version clearly says that a student can visit the Tech Bar for more information.

I made a more generic version of the resource and slapped an open license on it in case it might be helpful for others with DoOO projects. I’m hoping to think about this more, collaborate with others, and have more thoughts on this as we move forward.

PrivacyAndSecurityStudentsDoOO

Download link

On a somewhat related note I do want to draw attention to our most recent DigCiz call for engagement which is a parallel project to an interactive presentation that we will give at EDUCAUSE Annual Conference.  The call for engagement and the presentation basically ask the question “What do we owe students when we collect their data?”. To participate in the call just blog or tweet (Nate Angell even started a hypothes.is annotation of the post). To participate in the presentation come to the EDUCAUSE Annual conference presentation or participate in the twitter chat. All details on the post.

Designing for Privacy with DoOO: Reflections after DPL

The thinking for this post comes on the tail end of Digital Pedagogy Lab (DPL) where, despite not being enrolled in any of the data or privacy offerings, concerns of student data and privacy rang loud in my ears. This came from various conversations but I think it really took off after Jade Davis’ keynote and after Chris G and Bill Fitzgerald visited us in Amy Collier’s Design track to talk about designing for privacy. After the Lab I also came across Matthew Cheney’s recent blog post How Public? Why Public? where he advocates for public work that is meaningful because it is done so in conjunction with private work and where students use both public and private as options depending on what meets the needs of varying circumstances.

A big part of what attracts me to Domain of One’s Own (DoOO) is this possibility of increased ownership and agency over technology and a somewhat romantic idea I have that this can transfer to inspire ownership and agency over learning. In considering ideas around privacy in DoOO it occurred to me that one of the most powerful things about DoOO is that it has the capability of being radically publicly open but that being coerced into the open or even going open without careful thought is the exact opposite of ownership and agency.

In a recent twitter conversation with Kris Schaffer he referred to openness and privacy as two manifestations of agency. This struck me as sort of beautiful and also made me think harder about what we mean by agency, especially in learning and particularly in DoOO. I think that the real possibility of agency in DoOO starts from teaching students what is possible around the capabilities and constraints in digital environments. If we are really concerned about ownership and agency in DoOO then we have to consider how we will design for privacy when using it.

DoOO does allow for various forms and levels of privacy which are affected by deployment choices, technical settings, and pedagogical choices. I hear people talk about these possibilities and even throw out different mixes of these configurations from time to time but I have never seen those listed out as a technical document anywhere.

So, this is my design challenge. How can I look at the possibilities of privacy for DoOO, refine those possibilities for specific audiences (faculty and students), and then maybe make something that is not horribly boring (as technical documents can be) to convey the message? I do want to be clear that this post is not that – this post is my process in trying to build that and a public call for reflections on what it could look like or resources that may already exist. What I have so far is really just a first draft after doing some brainstorming with Tim C during some downtime at DPL.

Setting Some Boundaries
This could go in a lot of different directions so I’m setting some boundaries up front to keep a scope on things. I’d love to grow this idea but right now I’m starting small to get my head around it. I’m looking to create something digestible that outlines the different levels of privacy around a WordPress install on DoOO.  DoOO is so much bigger than just WordPress, I know that but I’m not trying to consider Omeka or other applications – yet. Also, I’m specifically thinking about this in terms of a class or other teaching/learning environment. A personal domain that someone is doing on their own outside of a teaching/learning environment is another matter with different, more personal, concerns.

Designing for Privacy with DoOO
Right now I’m dividing things up into two broad categories that interact with one another. I need better titles for them but what I’m calling Privacy Options are stand alone settings or approaches that can be implemented across any of the Deployments, which are design and pedagogical choices that are made at the onset. Each of these also affords and requires different levels of digital skills and I’m also figuring out how to factor that into the mix. I will start with Deployments because I think that is where this starts in practice.

Deployments:
Deployment 1 – Instructor controlled blog: With this deployment an instructor has their own domain where they install WordPress and give the students author accounts (or whatever level privileges make sense for the course). Digital Skills: Instructor needs to be comfortable acting as a WordPress administrator including: theming and account creation. Students gain experience as WordPress authors and collaborating in a single digital space.

Deployment 2 – Instructor controlled multisite: With this deployment an instructor installs a WordPress multisite on their own domain and each student gets their own WordPress site. Digital Skills: Running a multisite is different from running a single install and will require a bit more in the way of a digital skill set including: enabling themes and plugins, setting up subdomains and/or directories. Students can gain the experience of being WordPress administrators rather than just authors but depending on the options chosen this can be diminished.

Deployment 3 – Student owned domains: This is what we often think of as DoOO. Each student does not just get a WordPress account or a WordPress site but their own domain. They can install any number of tools but of course the scope of this document (for now) is just WordPress. Digital Skills: One fear I have is that this kind of deployment can be instituted without the instructor having any digital skills. Support for digital skills will have to come from somewhere but if this is being provided from some other area then the instructor does not need to have the skills themselves. Students will gain skills in cPanel, installing WordPress, and deleting WordPress.

Privacy Options
Privacy Options looks at approaches, settings, or plugins that can be used across any of the Deployments:

1 – Visibility settings: WordPress Posts and Pages have visibility settings for public, password protected, and private. These can be used by any author on any post and by admins on posts and pages.

2 – Private site plugin: Though I have not personally used a private site plugin I know that they exist and can be used to make a whole WordPress site private. Tim mentioned that he has used Hide My Site in the past with success.

3 – Pseudonyms: There is no reason that a full legal name needs to be used. How do we convey the importance of naming to students? I took a stab at this for my day job but I’m wondering what else can be done.

4 – Search engine visibility setting: This little tick box is located in WordPress under the reading settings and “discourages search engines from indexing the site” though it does say that it is up to the search engines to honor this request.

5 – Privacy protection at the domain level to obscure your name and address from a WhoIs lookup. Maybe not a concern if your institution is doing subdomains?

6 – An understanding of how posts and sites get promoted. Self promotion and promotion from others. How different audiences might get directed to your post or site.
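One way to double-check option 4 after ticking the box is to look at what the site actually sends to crawlers. Below is a minimal Python sketch, assuming the setting shows up as a robots meta tag containing "noindex" in the page's HTML. WordPress has generally signaled it this way, but the exact markup varies by version, so treat that as an assumption. The class and function names are hypothetical and only for illustration, and the sample HTML strings are made up.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags like <meta ... /> are also routed through here.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for part in attr_map.get("content", "").split(","):
            directive = part.strip().lower()
            if directive:
                self.directives.add(directive)


def discourages_indexing(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    # Hypothetical page sources, standing in for a saved copy of a site's home page.
    discouraged = "<html><head><meta name='robots' content='noindex, nofollow' /></head></html>"
    open_site = "<html><head><title>My domain</title></head></html>"
    print(discourages_indexing(discouraged))
    print(discourages_indexing(open_site))
```

Even when this check passes, it only confirms that the request is being made. As the setting itself says, it is up to the search engines to honor it, so it is not a substitute for the visibility settings or a private site plugin.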

Some Final Thoughts
There is one approach that I’d actually been leaning toward prior to Digital Pedagogy Lab that raises questions about how to introduce this. I do worry about the technical barrier that comes with learning about these privacy options. All of the privacy options come with some level of digital skill and/or literacy that needs to be in place or acquired. In addition, I think that often the deployments are made before the privacy options are considered; yes yes I know that is not ideal but it is a reality. Because of this, is it maybe just better to tell faculty and students, in the beginning at least, to think of their DoOO or their WordPress as a public space? Mistakes happen and are we muddying the waters by thinking of DoOO or WordPress as private spaces where a simple technical mistake could easily make things public? Most people have so many options for private reflection and drafting; from Google Docs to the LMS, email to private messaging we have so many tools that are not so radically publicly open. Is there something to be said for thinking of the domain space as public space and using it for that – at least while building the skills necessary to make it more private?

I don’t have the answers but I wanted to open the conversation and see what others are thinking. Are there resources that I’m missing and how can this be created in a way that will be easy to understand and digestible? I’m thinking and writing and booking some folks for conversations to keep thinking in this way. Stay tuned and I’ll keep learning transparently.

Big thanks to Tim C and Chris G for giving feedback on a draft of this post.

Photo original by me licensed CC-BY

Platform Literacy in a Time of Mass Gaslighting – Or – That Time I Asked Cambridge Analytica for My Data

Digital Citizenship and Curiosity 

In the beginning of 2017 I first discovered Cambridge Analytica (CA) through a series of videos that included a Sky News report, some of their own advertising, as well as a presentation by their CEO Alexander Nix. I found myself fascinated by the notion that big data firms, focused on political advertising, were behind those little facebook quizzes; that these data firms were creating profiles on people through harvesting their data from these quizzes and combining it with other information about them like basic demographics, voter and districting information, and who knows what else to create a product for advertisers. I was in the process of refining a syllabus for a class and creating an online community around digital citizenship so this was of particular interest to me.

My broad interest in digital citizenship is around our rights and responsibilities online and I was compelled by the thought that we could be persuaded to take some dumb quiz and then through taking that quiz our data would be taken and used in other ways that we never expected; in ways that would be outside of our best interests. 

I had questions about what we were agreeing to: how much data firms could know about us, what kind of metrics they were running on us, how the data could be shared, and what those messages of influence might look like. I started asking questions but when the answers started coming in I found myself paralyzed under the sheer weight of how much work it took to keep up with all of it, not to mention the threats of financial blowback. This paralysis made me wonder about the feasibility of an everyday person challenging this data collection and requesting their own data to better understand how they were being marketed to, and of course about the security and privacy of the data.

Cambridge Analytica is again in the news with a whistleblower coming forward to give more details – including that the company was harvesting networked data (that is, not just your data but your friends’ data) from Facebook itself (reactions, personal messages, etc.) and not just the data entered into the quizzes. Facebook has suspended Cambridge Analytica’s accounts and distanced itself from the company. Additionally, David Carroll, a professor at Parsons School of Design at The New School, filed a legal action this past week against the company in the UK. The story is just going crazy right now and every time I turn around there is something new.

However, much of this conversation is happening from the perspective of advertising technology (adtech), politics, and law. I’m interested in it from the perspective of education so I’d like to intersect the two.

The Request

A few weeks after I found those videos, featured by and featuring Cambridge Analytica, I came across a Motherboard article that gave some history of how the company was founded and how they were hired by several high profile political campaigns. Around this time I also found Paul-Olivier Dehaye of personaldata.io who was offering to help people understand how to apply to get a copy of their data from Cambridge Analytica based on the Data Protection Act (DPA), as the data was being processed in the UK.

My interests in digital citizenship and information/media/digital literacy had me wondering just how much data CA was collecting and what they were doing with it. Their own advertising made them sound pretty powerful but I was curious about what they had, how much of it I’d potentially given to them through taking stupid online quizzes, and what was possible if combined with other data and powerful algorithms.

The original request was not to Cambridge Analytica but rather to their parent company SCL Elections. There was a form that I had to fill out and a few days later I got another email stating that I had to submit even more information and GBP £10 payable in these very specific ways.

Response from SCL asking for more information from me before they would process my Subject Access Request

Out of all of this, I actually found the hardest part to be paying the £10. My bank would only wire transfer a minimum of £50 and SCL told me that my $USD check would have to match £10 exactly after factoring in the exchange rate the day they received it. I approached friends in the UK to see if they would write a check for me and I could pay them back. I had a trip to London planned and I considered dropping by their offices to give them cash, even though that was not one of the options listed. It seemed like a silly barrier, that a large and powerful data firm could not accept a PayPal payment or something and would instead force me into overpayment or deny my request due to changes in the exchange rate. In the end, PersonalData.io paid for my request and I sent along the other information that SCL wanted.

Response

After I got the £10 worked out with Paul I heard from SCL pretty quickly saying that they were processing my request, and then a few days later I got a letter and an Excel spreadsheet from Cambridge Analytica that listed some of the data that they had on me.

It was not a lot of data, but I have administered several small learning platforms and one of the things that you learn after running a platform for a while is that you don’t really need a lot of data on someone to make certain inferences about them. I also found the last tab of the spreadsheet to be disconcerting as this was the breakdown of my political beliefs. This ranking showed how important various political issues were to me on a scale of 1-10, but there was nothing that told me how that ranking was obtained.

Are these results on the last tab from a quiz that I took when I just wanted to know my personality type or what Harry Potter character I most resemble? Is this a ranking based on a collection and analysis of my own Facebook reactions (thumbs up, love, wow, sad, or anger) on my friends’ postings? Is this a collection and analysis of my own postings? I really have no way of knowing. According to the communication from CA it is these mysterious “third parties” who must be protected more than my data.

Excerpt from the original response to the Subject Access Request from Cambridge Analytica

In looking to find answers to these questions Paul put me in touch with Ravi Naik of ITN Solicitors, who helped me to issue a response to CA asking for the rest of my data and more information about how these results were garnered about me. We never got a response that I can share, and in considering my options and the potential for huge costs I could face, it was just too overwhelming.

Is it okay to say I got scared here? Is it okay to say I chickened out and stepped away? Cause that is what I did. There are others who are more brave than me and I commend them. David Carroll, who as I mentioned earlier just filed legal papers against CA, followed the same process that I did and is still trying to crowdfund resources. I just didn’t have it in me. Sorry democracy.

It kills me. I hope to find another way to contribute.

Platform Literacy and Gaslighting

So now it is a year later and the Cambridge Analytica story has hit and everyone is talking about it. I backed away from this case and asked Ravi to not file anything under my name months ago and yet here I am now releasing a bunch of it on my blog. What gives? Basically, I don’t have it in me to take on the financial risk but I still think that there is something to be learned from the process that I went through in terms of education. This story is huge right now but the dominant narrative is approaching it from the point of view of advertising, politics, and the law. I’m interested in this from the perspective of what I do – educational technology.

About a week ago educational researcher and social media scholar danah boyd delivered a keynote at the South by Southwest Education (SXSW Edu) conference where she pushed back on the way we approach media literacy with a focus on critical thinking – specifically in teaching but this also has implications for scholarship. This talk drew a body of compelling criticism from several other prominent educators including Benjamin Doxtdator, Renee Hobbs, and Maha Bali, which inspired boyd to counter with another post responding to the criticisms.

The part of boyd’s talk (and her response) that I find particularly compelling in terms of overlap with this Cambridge Analytica story is in the construct of gaslighting in media literacy.  boyd is not the first to use the term gaslighting in relation to our current situation with media but, again, often I see this presented from the perspective of adtech, law, or politics and not so much from the perspective of education.

If you don’t know what gaslighting is you can take a moment to look into it but basically it is a form of psychological abuse between people who are in close relationships or friendships. It involves an abuser who twists facts and manipulates another person by drawing on that close proximity and the knowledge that they hold about the victim’s personality and other intimate details. The abuser uses the personal knowledge that they have of the person to manipulate them by playing on their fears, wants, and attractions.

One of the criticisms of boyd’s talk, one that I’m sympathetic to, is around the lack of blame that she places on platforms. Often people underestimate what platforms are capable of and I don’t think that most people understand the potential of platforms to track, extract, collect, and report on your behaviour.

In her rebuttal to these criticisms, to which I am equally sympathetic, boyd states that she is well aware of the part that platforms play in this problem and that she has addressed that elsewhere. She states that is not the focus of this particular talk to address platforms and I’m okay with that – to a point. Too often we attack a critic (for some reason more often critics of technology) who is talking about a complex problem for not addressing every facet of that problem all at once. It is often just not possible to address every angle at the same time and sometimes we need to break it up into more digestible parts. I can give this one to boyd – that is until we start talking about gaslighting.

It is exactly this principle of platforms employing this idea of personalization, or intimate knowledge of who a person is, which makes the gaslighting metaphor work. We are taking this thing that is a description of a very personal kind of abuse and using it to describe a problem at mass scale. It is the idea that the platform has data which tells it bits about who you are and that there are customers (most often advertisers) out there who will pay for that knowledge. If we are going to bring gaslighting into the conversation then we have to address the ability of a platform to know what makes you like, love, laugh, wow, sad, and angry and use that knowledge against you.

We don’t give enough weight to what platforms take from us and how they often hide or own data from us and then sell it to third parties (users don’t want to see all that messy metadata…. Right?).  I’m not sure you even glimpse the possibilities if you are not in the admin position – and who gets that kind of opportunity?

It would be a stretch to call me a data scientist, but I’ve built some kind of “platform literacy” after a little more than a decade of overseeing learning management systems (LMS) at small colleges. Most people interact with platforms as a user, not as an admin, so they never get that. I’m not sure how to quantify my level of platform literacy but please understand that I’m no whiz kid – an LMS is no Facebook and in my case we are only talking about a few thousand users. I’m more concerned with making the thing work for professors and students than anything. However, in doing even a small amount of admin work you get a feel for what it means to consider and care about things on a different level: how accounts are created, how they interact with content and with other accounts, the way accounts leave traces through the content they contribute but also through their metadata, and how the platform is always monitoring this and how as an administrator you have access to that monitoring when the user (person) often does not.

I don’t think that most LMS admins (at least as LMSs are currently configured) at small colleges are incentivised to go digging for nuanced details in that monitoring unprompted. I do think that platform owners who have customers willing to pay large sums for advertising contracts have more of a motivation to analyze such things.

Educational researchers are incentivised to show greater returns on learning outcomes and the drum beat of personalized learning is ever present. But I gotta ask if we can pause for a second and think… is there something to be learned for education from this whole story of Cambridge Analytica, Facebook, personalization, and the microtargeting of advertising? Look at everything that I went through to try to better understand the data trails that I’m leaving behind and I still don’t have the answers. Look at the consequences that we are now seeing from Facebook and Cambridge Analytica. The platforms that we use in education for learning are not exempt from this issue.

My mind goes back to all the times I’ve heard utopian dreams about making a learning system that is like a social media platform. All the times I’ve seen students who were told to use Facebook itself as a learning tool. So many times I’ve sat through vendor presentations around learning analytics and then during Q&A asked “where is the student interface – you know, so the student can see all of this for themselves” only to be told that was not a feature. All the times I’ve brainstormed the “next generation digital learning environment” only to hear someone say “can we build something like Facebook?” or “I use this other system because it is so much like Facebook”. I get it. Facebook gives you what you want and it feels good – and oh how powerful learning would be if it felt good. But I’m not sure that is what learning is.

In her rebuttal boyd says that one of the outstanding questions that she has after listening to the critics (and thanking them for their input) is how to teach across gaslighting. So, it is here where I will suggest that we have to bring platforms back into the conversation. I’m not sure how we talk about gaslighting in media without looking at how platforms manipulate the frequency and context with which media are presented to us – especially when that frequency and context is “personalized” and based on intimate knowledge of what makes us like, love, wow, sad, grrrr.

Teaching and learning around this is not about validating the truthfulness of a source or considering bias in the story. Teaching and learning around this is about understanding the how and why of the thing, the platform, that brings you the message. The how and why it is bringing it to you right now. The how and why of the message looking the way that it does. The how and why of a different message that might be coming to someone else at the same time. It is about the medium more than the message.

And if we are going to talk about how platforms can manipulate us through media we need to talk about how platforms can manipulate us and how some will call it learning. Because there is a lot of overlap here and personalization is attractive – no really, I mean it is really really pretty and it makes you want more. I have had people tell me that they want personalization because they want to see advertising for the things that they “need”. I tried to make the case that if they really needed it then advertising would not be necessary, but this fell flat.

Personalization in learning and advertising is enabled by platforms. Just as there are deep problems with personalization of advertising, we will find those problems multiplied by tens of thousands when we apply it to learning. Utopian views that ignore the problems of platforms and personalization are only going to end up looking like what we are seeing now with Facebook and CA. The thing that I can’t shake is this feeling that the platform itself is the thing that we need more people to understand.

What if instead of building platforms that personalized pathways or personalized content we found a way to teach platforms themselves so that students really understood what platforms were capable of collecting, producing, and contextualizing? What if we could find a way to build platform literacy within our learning systems so that students understood what platforms are capable of doing? Perhaps then when inside of social platforms people would not so easily give away their data and when they did they would have a better understanding of the scope. What if we were really transparent with the data that learning systems have about students and focused on making the student aware of the existence of their data and emphasised their ownership over their data? What if we taught data literacy to the student with their own data? If decades ago we had focused on student agency and ownership over platforms and analytics I wonder if Cambridge Analytica would have even had a product to sell to political campaigns let alone ever been a big news story.

I’m not saying this would be a fail-safe solution – solutions come with their own set of problems – but I think it could be a start. It would mean a change in the interfaces and structures of these systems but it would mean other things too. Changes in the way we make business decisions when choosing systems and changes in the way we design learning would have to be there too. But we have to start thinking and talking about platforms to even get started – because the way they are currently configured has consequences.

Image CC0 from Pixabay