Download: Episode 68.
This week we’re talking about the ethics of corporate research and how your data is used, Twitter’s developer API changes, how Amazon Prime Day went, and more.
Show Notes
- [01:08] Devopsdays Portland – SEPTEMBER 11-13, 2018 – RECOMPILERFRIENDS 20% discount
- [01:37] DevOpsDays ticket giveaway – enter by Aug 20!
- [01:59] The Recompiler Issue 8: Wildcard
- [03:43] Responsible Communication Style Guide reprint
- [04:19] New developer requirements to protect our platform
- [13:59] Dropbox still has questions to answer after claims of improper data sharing | ZDNet
- [15:43] A Study of Thousands of Dropbox Projects Reveals How Successful Teams Collaborate
- [23:44] How collaborating in Dropbox helps NICO advance scientific research
- [31:01] Pandora’s Checkbox – Emily St*
- [31:28] Private and secure multiparty histograms
- [35:23] Why Is Google Translate Spitting Out Sinister Religious Prophecies?
- [38:10] Amazon’s facial-recognition tool misidentified 28 lawmakers as people arrested for a crime, study finds – The Washington Post
- [41:41] The Motherboard Guide to Amazon Prime Day’s Best Deals
- [44:23] Amazon warehouse workers are striking across Europe on Prime Day
- [44:33] Muslim Amazon Employees Protest Increased Workload During Ramadan | Observer
- [45:15] The Hidden Environmental Cost of Amazon Prime’s Free, Fast Shipping
- [45:22] I’m Starting to Have Serious Doubts About Amazon Prime
- [53:24] #124 The Magic Store by Reply All from Gimlet Media
- [57:19] Lina Khan and the “Hipster Antitrust” Movement – The Atlantic
- [59:25] Academic writes 270 Wikipedia pages in a year to get female scientists noticed
- [1:00:35] The Library Music Project Will Surprise and Delight Your Ears – Music – Portland Mercury
Now Broadcasting LIVE most Fridays
We broadcast our episode recordings LIVE on most Fridays at 12pm PT. Mark your calendars and visit recompilermag.live to tune in.
We love hearing from you! Feedback, comments, questions…
We’d love to hear from you, so get in touch!
You can leave a comment on this post, tweet to @recompilermag or our host @christi3k, or send an email to podcast@recompilermag.com.
Transcript
CHRISTIE: Hello and welcome to The Recompiler, a feminist hacker podcast where we talk about technology in a fun and playful way. I’m your host, Christie Koehler.
3…2…1…it’s connecting, alright. We should be on air.
AUDREY: Great. Hello.
CHRISTIE: Hey, Audrey. Hey, everyone. It’s Friday, July 27th, about noon Pacific Time. We’re doing our live broadcast and recording of Episode 68 of The Recompiler podcast. This week, Audrey and I are going to talk about the ethics of corporate research and how your data is used, Twitter’s developer API changes, how Amazon Prime Day went, and some other stuff. But first, we got some announcements. What have you got, Audrey?
AUDREY: All right. So The Recompiler is a community sponsor for DevOpsDays Portland. It’s happening September 11th through 13th. The DevOpsDays events are a series of technical conferences that talk about software development, IT infrastructure ops and their interrelatedness, as well as how people do that work. We have a ticket code: RECOMPILERFRIENDS will get you 20% off your ticket. We’re also doing a ticket giveaway. We have one ticket to give to a reader or a listener and you can find a link to put your name in for that in the show notes.
CHRISTIE: Awesome.
AUDREY: And I have my ticket now. Do you?
CHRISTIE: Yup.
AUDREY: Awesome.
CHRISTIE: I shall be there. And what else have you got? Issue 8?
AUDREY: Issue 8 is on the website. All of the articles are there and free to read. We have a comic about being nonbinary and trans in tech. We have an article about cell phone networks. I never actually opened this and put it in front of me when I start talking. I’m always just like, “Oh, wait. Which issue…” because we have multiple ones going at the same time. Wait, I’m going to grab it. Never mind. I don’t have another box next to me. Well anyhow, there’s a bunch of great stuff. It was our Wildcard issue, so it’s a little bit of everything. And the articles are online for you to read.
CHRISTIE: So we got Finding a Path with the Fibonacci Sequence, Programming While Trans, Getting Started with UI Templating, Build Your Interviewing Strategy, Demystifying Cellular Communication: A Gentle Introduction to Cellular Networks. I love this picture because it’s a palm tree that is clearly actually a cell tower.
AUDREY: Those are my favorite.
CHRISTIE: Yeah. Everyone’s Sky: Citizen Scientists’ Role in Observing Tabby’s Star, and then I’ve Forgotten My Name – Can I have Yours? Good stuff.
AUDREY: It’s really a great issue.
CHRISTIE: And I think you have one more. Oh, so Recompilermag.com is where you can go to get that. And then there are still print copies available?
AUDREY: There are. There’s still a few.
CHRISTIE: Shop.recompilermag.com. And getting a subscription is really a great way to support the magazine and the podcast.
AUDREY: And then our last thing is that we are reprinting The Responsible Communication Style Guide. I know people have been waiting for this for a while. We ran out sometime last Fall, I think, of our copies from the first printing. So it’s at the printer. I’m going to go in and tell them to go ahead and print it. And then we’ll be getting it in everybody’s hands very shortly.
CHRISTIE: Woohoo!
AUDREY: Yup. We’re going to print a little bit over the preorders but the best way to make sure that you get a copy is still to put in your order right now.
CHRISTIE: Shop.Recompilermag.com. So, Twitter’s making some API changes.
AUDREY: Apparently.
CHRISTIE: Specifically related to developer access to the API. So if you want to write some program that interfaces with the Twitter API, this would affect you. There’s a new application process to get your developer account in the first place. And so now when you apply, it says, “All developers will be required to provide detailed information about how they use or intend to use Twitter’s API so that we can better ensure compliance with our policies.” I started the process and I put a screenshot here on our show notes. But after you say who you are, it says, “Tell us about your project.” And then you say which use cases you are interested in and there are checkboxes and it’s things like academic research, advertising, audience analysis, chatbots, animation et cetera, et cetera. And then you have to provide a text description of what you’ll be doing. And then there’s a radio button for ‘will your product, service, or analysis make Twitter content or derived information available to a government entity or an entity who serves government entities’.
AUDREY: Interesting. I mean, that could be a lot of researchers too.
CHRISTIE: Yeah.
AUDREY: Not just what we think of as normal governmental use.
CHRISTIE: They will be limiting the default number of apps a single developer can have registered to 10 with the ability to request more. And then they’re changing the rate limiting, the default rate limiting. So it’s things like tweets and retweets and this is per app. And there’s also, I guess, individual developer or user limits too which I didn’t look up. But tweets and retweets 300 per three hours, likes a thousand per 24 hours, follows a thousand per 24 hours, direct messages 15k per 24 hours. If you’re talking like an artbot, that seemed pretty reasonable.
AUDREY: But if you’re talking about an app like a client, that’s quite restrictive.
CHRISTIE: Yeah or a bot that looks for certain keywords and replies to people or whatever, that could. So already, I’m seeing reactions from people who I know to be hosting funny bots that are basically like ‘I don’t have time for this shit’.
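To make the difference between an art bot and a busy client concrete, here is a minimal sketch of a client-side check against the default per-app limits read out above. The `SlidingWindowLimiter` class is hypothetical and is not part of any Twitter library, and treating the limits as sliding windows is an assumption: the episode does not say how Twitter counts the window.

```python
from collections import deque

# Default per-app limits as read out in the episode: (max actions, window in seconds).
LIMITS = {
    "tweet_or_retweet": (300, 3 * 60 * 60),
    "like": (1000, 24 * 60 * 60),
    "follow": (1000, 24 * 60 * 60),
    "direct_message": (15000, 24 * 60 * 60),
}


class SlidingWindowLimiter:
    """Hypothetical client-side guard; not part of any Twitter library."""

    def __init__(self, limits=LIMITS):
        self.limits = limits
        self.history = {action: deque() for action in limits}

    def allow(self, action, now):
        """Return True and record the action if it still fits in the window."""
        max_count, window = self.limits[action]
        timestamps = self.history[action]
        # Drop timestamps that have aged out of the window.
        while timestamps and now - timestamps[0] >= window:
            timestamps.popleft()
        if len(timestamps) >= max_count:
            return False
        timestamps.append(now)
        return True


# An art bot posting once an hour stays far below 300 posts per three hours.
art_bot = SlidingWindowLimiter()
art_bot_ok = all(art_bot.allow("tweet_or_retweet", hour * 3600) for hour in range(24))

# A client or reply bot posting every 30 seconds would need 360 posts per window.
busy = SlidingWindowLimiter()
busy_results = [busy.allow("tweet_or_retweet", i * 30) for i in range(400)]
first_blocked = busy_results.index(False)
print(art_bot_ok, first_blocked)
```

Under these assumptions the hourly bot is never blocked, while the 30-second poster is refused on its 301st post and has to wait for old posts to age out of the three-hour window.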
AUDREY: That was how you found out about it, right? That you started to see some bot developers talking about it.
CHRISTIE: I don’t remember if that surfaced before the actual notice from Twitter or not. It’s very possible, yeah.
AUDREY: Have you gone through the developer application process before?
CHRISTIE: Yeah, ’cause for almost anything, like I’ve done it in conjunction with self-hosted WordPress before and then I think I had to do it when I had the ThinkUp app that doesn’t exist anymore. So, that sort of thing. I haven’t written a bot or anything.
AUDREY: I used to do a lot of social media integration on web apps. And so for staging, I would use my own account. One that I no longer have access to which is sort of funny. If you set up like your first initial last name on Twitter and then it’s attached to an email address of a workplace that you no longer work at, Twitter doesn’t have a good mechanism for you to say, “No really, there’s only one of me. I would like to have that account or at least that username back.” So I would have a lot of different apps set up on my account for staging purposes so that I could test some social media integration. You get into these weird workflows whenever these kinds of restrictions happen where you have to do things that are just increasingly awkward to work within the rules. And if they’re thinking about spammy and bad behavior then that friction is fine. But for app developers who have actual customers who like what they’re doing, it’s a little bit more frustrating. And I think that there’s been a feeling for a while that Twitter’s API has gone from this really easy to connect with, very open thing, to something that’s a lot more unpredictable, limited, hard for external developers to work with. Especially for people who develop third party clients for accessing Twitter, I think the situation’s gotten worse and worse over time.
CHRISTIE: The post from Twitter also said that they’re adding a new ability to report an app. And the first thing I thought of was because Twitter used to tell you what client a tweet was posted from and they removed that some time ago.
AUDREY: They may still have the data.
CHRISTIE: Yeah, but as a user, how would I…it just didn’t seem very actionable. It just didn’t make sense to me. I was bummed when they took that away because it was a way to learn about different ways to interact with Twitter. I’m like, “Oh, everyone’s using this Twitter client. I’ll check that out.” It’s interesting that by taking that away, I actually think they took away some of the ability to make informed critique about the tweets you’re seeing.
AUDREY: That makes sense. And I was thinking about an earlier episode we did about tweetdecking and how having just the client information would help identify that behavior a little bit for people who are seeing it initially. I know it also maybe caused them to mentally flag some stuff that wasn’t part of that kind of spamming. But still, it would be more information that you’d have to understand what you’re seeing.
CHRISTIE: You remind me, I’m going to TweetDeck to…okay, does TweetDeck still do it? Let me look. Yes. So TweetDeck, it says Twitter for iPhone, Twitter for Android. I’m just clicking on some random…Twitter Web Client. Okay, so TweetDeck shows this.
AUDREY: So then it is in the data that they’re sending through the API?
CHRISTIE: I need to remember that this TweetDeck interface still works like this because I keep forgetting it exists. You know what’s really funny is I used to really like this narrow column format. But then I got really unused to it.
AUDREY: Sometimes even subtle differences in UI can have a huge impact on how we perceive things and how we interact with them.
CHRISTIE: Yeah.
AUDREY: Even something that seems really simple like the width of the column layout can affect your reading comprehension, how you put pieces of information together, how you assess it. It’s not a neutral kind of thing. And I think it makes sense that Twitter is making these kinds of changes with the API. I think there’s a lot of fun stuff that won’t be worth the effort like you’re seeing bot developers say. I have, I think, maybe five Twitter bots that are active and I haven’t looked at them in two years. So it’s still kind of sometimes entertaining what they’re doing but I wouldn’t put any effort toward keeping them online either.
CHRISTIE: Right. Anything else with the Twitter API changes?
AUDREY: No, I don’t know. It seems like another one of those examples of how we have these big platforms that are controlled by a single entity. And so what’s good for them has different impacts on all of us. They can make a system-wide decision that makes a lot of sense for the system but that doesn’t mean that it enhances the usability for everyone. And we just have so little control over that.
CHRISTIE: And I would argue that most companies producing software or some kind of software web services, I don’t think usability is their primary concern. Like there are just so many ubiquitous things that I find subpar in terms of quality.
AUDREY: Because I pay for a lot of software services as a business now, I definitely have my own personal ranking of who I think has a good UI, who I think listens to their customers well, where I get good support. There are companies that I think are doing really good and companies where I’m like, “If only your competitors were better, I would switch.”
CHRISTIE: Speaking of which…
AUDREY: Oh my God, should we switch?
CHRISTIE: Yeah. Dropbox.
AUDREY: Yeah.
CHRISTIE: And I went on a different rabbit hole after we did our content meeting yesterday, Audrey, because I’ve got kind of a theory to share with you.
AUDREY: Yes, so we had kind of at least two different areas that we had thought about in terms of this research study that Dropbox, I don’t know, has sponsored in a way to look at collaboration and academic performance, like research performance.
CHRISTIE: The headline was something like how successful teams collaborate.
AUDREY: Which is ridiculously broad given what they actually looked at and…I don’t know, we’ll get into this. There’s some stuff about the research where I just…
CHRISTIE: Yes, that’s what I wanted to talk more about.
AUDREY: This is never going to make it through peer review.
CHRISTIE: I don’t think there’s actual research behind it. I don’t think there’s a paper behind it. That was my new theory.
AUDREY: Oh, man. That would explain…so, do we want to give some background here?
CHRISTIE: Yeah. First let’s explain what we’re talking about because not everyone sits on the news like we do.
AUDREY: Okay.
CHRISTIE: Dropbox is a file sharing service. And they gave a data set to a research unit at Northwestern that studies complex systems or something. And these researchers did some analysis on it and about collaboration and then somehow this piece for Harvard Business Review got written that basically it’s really like there is very little substance in this. And they looked at sort of the folder structure. They looked at how many collaborators were on a shared folder, how much they collaborated. They derived some information about the relative seniority of the collaborators whether they were more senior in their career or more junior, the methodology with which they did that I think is questionable. And then came up with some conclusions. And so, the conclusions were go small.
AUDREY: And they looked at the ranking of the research institutions too.
CHRISTIE: Yes.
AUDREY: The departments.
CHRISTIE: They compared the top 10 universities with the bottom 10.
AUDREY: From a very specific ranking. They did not compare against different methods of ranking these organizations. And similarly for when they looked at the research, the citations and publication histories. I think that they pulled that from a single data pool.
CHRISTIE: And the way they determined seniority was by number of citations.
AUDREY: They had some kind of cut off that they picked.
CHRISTIE: Maybe that’s a really good proxy. I don’t know, but it seemed questionable to me. So they go small, take your time, increase same team collaborations, aim for equality, embrace experience. The other thing that popped out at me just [inaudible] conclusions was that the variance is not huge, like the average number. So their ‘go small’ came from a difference of .7 persons between the top 10% universities and the bottom 10%.
AUDREY: So this is when I started to think that the whole thing had been cooked because I started looking at the numbers and the Dropbox thing doesn’t give any numbers, like their blog post. But I started looking at the HBR article and the numbers that they were sharing and I thought this might not be statistically significant at all. And without the full research, like a full research paper where they explain their modeling methodology and their calculations, we have no way of verifying that.
CHRISTIE: Right.
AUDREY: Sorry, I don’t need to laugh straight into the microphone. But this just really struck me as like the ways that social research can be totally cooked and not very meaningful at all.
CHRISTIE: The other problem is that there’s no links to the paper. And at the bottom it says, “It’s based on a scientific paper with lead author.” Oh, they’ve updated that. “If you’d like a copy, email the authors,” which I’m pretty sure that they’re not answering their email right now because of the backlash this received. So I went looking for preprints. I couldn’t find any. So I don’t know what stage this is at. And in fact, I think I’ll make a calendar reminder in like 3 and 6 months to go look for more preprints to see if I can find this.
AUDREY: Who knows, maybe they cleaned it up at that point.
CHRISTIE: They might have but also I wonder if…I think this is either in a very early stage or it’s just an exploration they did and there’s never going to be a paper about it. I don’t know.
AUDREY: Yeah.
CHRISTIE: But I could be wrong. I would really like to read the paper because I have all kinds of questions about it.
AUDREY: Yeah, me too. And I think we both separately tweeted asking if anybody had gotten their hands on it. I didn’t get any responses.
CHRISTIE: Oh, I didn’t realize you tweeted about it. I really want to know people who are in academia, what is this…and who know more about the cycle of research and papers and then these kinds of pop science writings, where do they fit in? Like at what point do you write this sort of more watered down version in that cycle? And do you ever do that when there isn’t a paper?
AUDREY: I don’t know. That part seems pretty weird to me that there wouldn’t at least be a draft paper or preprint or something to look at this point. And I wondered a lot about the editorial process here, like how something gets into the Harvard Business Review.
CHRISTIE: The other thing I noticed is that one of the authors of the HBR thing is a manager of Enterprise Insights at Dropbox. So the more I started looking at this, the more this screamed marketing whitepaper to me. And then I noticed that the research unit that did this is actually a Dropbox customer. There’s a piece from a couple months ago that talks about how they’re using Dropbox.
AUDREY: So maybe this is some just weird funding thing that happened where they’re using Dropbox [inaudible] suggested or somebody in there suggested a partnership. And so Dropbox is effectively sponsoring this research, getting it out there. It makes Dropbox look good if they know something about how people use their product, how they collaborate and how you could collaborate better using it.
CHRISTIE: Right.
AUDREY: That’s totally a marketing thing. But there were so many things in here where I was like, “How did you come to that conclusion and know that you can say it,” because they’re proxying a lot of the things. They’re taking a measure to say, like you said about the seniority cutoff. They’re making decisions about that. They are giving things like just that first ‘go small’. It says, “The average number of people on a project at a top-10% university was 2.3; at a bottom-10% institution it was 3.0.”
CHRISTIE: Yeah.
AUDREY: That’s so little. If it was like 2 versus 5, I would start to think, “Oh okay, maybe they’re on to something.” But it’s so small which means that I would want to see some certainty around that, some statistical certainty that they’re not communicating here.
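A quick way to see why the two averages alone cannot settle this: the same 0.7 gap can be a large or a negligible effect depending on how spread out the team sizes are and how many projects were counted. The sketch below computes Cohen's d and Welch's t statistic from summary statistics. Only the two means come from the HBR piece; the standard deviations and sample sizes are hypothetical, chosen to show both extremes.

```python
import math


def cohens_d(mean_a, mean_b, sd_a, sd_b):
    """Standardized mean difference using the root-mean-square of the two SDs."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


def welch_t(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Welch's t statistic from summary statistics (unequal variances allowed)."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Reported in the HBR piece: mean team sizes of 3.0 (bottom 10%) and 2.3 (top 10%).
BOTTOM_MEAN, TOP_MEAN = 3.0, 2.3

# HYPOTHETICAL spreads and sample sizes: the article gives neither.
scenarios = {
    "tight spread, many projects": {"sd": 0.5, "n": 1000},
    "wide spread, few projects": {"sd": 4.0, "n": 30},
}

results = {}
for label, s in scenarios.items():
    d = cohens_d(BOTTOM_MEAN, TOP_MEAN, s["sd"], s["sd"])
    t = welch_t(BOTTOM_MEAN, TOP_MEAN, s["sd"], s["sd"], s["n"], s["n"])
    results[label] = (d, t)
    print(f"{label}: Cohen's d = {d:.2f}, Welch t = {t:.2f}")
```

In the first made-up scenario the gap is far larger than the noise; in the second it is well inside it. Without the real spreads and counts, a reader cannot tell which case the study is closer to.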
CHRISTIE: The other thing is that there’s no talk or discussion about the type of work that is done at Dropbox versus other tools. And so, I think you have to have an understanding about the type of work that happens to understand what these different frequencies mean.
AUDREY: Speaking to that, they may not all be using Dropbox for the same reason in the same way.
CHRISTIE: Right.
AUDREY: And for all we know, the top-10% universities have completely different policies about how Dropbox is used than the bottom-10% universities.
CHRISTIE: You could have two people sitting next to each other working on a paper and then they just save it to Dropbox when they’re done. Or you could have them…it could just be like…yeah, you just don’t know.
AUDREY: Dropbox is where the pre-prints go and not the things that are fully drafts.
CHRISTIE: So that also bothered me about it. It just feels very advertorial to me. And then I just noticed this because I was looking at this…this wasn’t even that long ago. In June, they wrote “How collaborating in Dropbox helps NICO advance scientific research.” And I scrolled all the way to the bottom and there’s ‘to learn more about the way technology is changing collaboration, tune in to our upcoming webinar led by Brian Uzzi, the author of the “paper” and Dropbox Customer Insight Manager, Rebecca Hinds’. And then I tried to click on the webinar but it takes me to a form to get a book. I was hoping there’d be a recording of the webinar. This is not research; this is cross marketing.
AUDREY: Sure, yeah. I don’t know. I took a couple of statistics and research classes in college which admittedly was awhile ago and this is kind of tying into the privacy stuff that I think we want to talk about. One of the things that I remember about how to lie with statistics is that the way that you cluster data matters, the way that you set your cut offs matters, and that you can change your results quite a bit when you start tinkering with those sorts of things.
CHRISTIE: Yeah.
AUDREY: And so what Dropbox says that they did to protect the privacy of researchers whose data that they were sharing was that they grouped it in some way. They clustered the data so that it couldn’t be linked back to specific histories.
CHRISTIE: I thought that they also really harped on that and on anonymizing.
AUDREY: That they didn’t list the names of the users or the names of the institutions, any of that kind of stuff.
CHRISTIE: I was trying to get the sense that maybe they had hashed the names of the folders or something to obscure them or given them a first level data analysis. But it’s not clear, we don’t know.
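As a sketch of what "hashing the names" could mean, and why the details matter, the example below contrasts a plain hash with a keyed one. This is an illustration only: nothing in the coverage says which method Dropbox used, and the folder names are invented. A plain SHA-256 of a guessable name can be reversed by hashing likely names and comparing; an HMAC with a secret key cannot be recomputed without the key. In both cases identical names still map to identical tokens, so this is pseudonymization rather than full anonymization.

```python
import hashlib
import hmac
import secrets


def unkeyed_hash(name: str) -> str:
    """Plain SHA-256: anyone can recompute it for a guessed name."""
    return hashlib.sha256(name.encode("utf-8")).hexdigest()


def keyed_pseudonym(name: str, key: bytes) -> str:
    """HMAC-SHA256: recomputing the token requires the secret key."""
    return hmac.new(key, name.encode("utf-8"), hashlib.sha256).hexdigest()


# HYPOTHETICAL folder names, invented for illustration.
folders = ["grant-proposal", "lab-meeting-notes", "grant-proposal"]

# Dictionary attack on the unkeyed hash: hash likely names and compare.
guesses = ["thesis", "grant-proposal", "budget"]
lookup = {unkeyed_hash(g): g for g in guesses}
recovered = [lookup.get(unkeyed_hash(f)) for f in folders]

# With a secret key held only by the data owner, the same guessing fails.
key = secrets.token_bytes(32)
tokens = [keyed_pseudonym(f, key) for f in folders]
attacker_key = secrets.token_bytes(32)  # the attacker does not know `key`
attacker_lookup = {keyed_pseudonym(g, attacker_key): g for g in guesses}
recovered_keyed = [attacker_lookup.get(t) for t in tokens]

print(recovered)        # guessed names come back for the unkeyed hash
print(recovered_keyed)  # nothing comes back for the keyed tokens
print(tokens[0] == tokens[2])  # equal names still produce equal tokens
```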
AUDREY: Without hearing something about…and in a paper, you would say what statistical methods you used and what your computations looked like. This ZDNet article says while the researchers initially claim Dropbox gave them raw data which they anonymized, the report was updated after Monday because Dropbox had anonymized the data before handing it over. And I thought rendering any identifying user information permanently indecipherable. There is something else that made me think that they had done some pre-analysis basically.
CHRISTIE: The thing that was added to the bottom of the HBR article and I don’t know when it was added, so you may or may not have seen. It says, “Dropbox anonymized and aggregated the data before providing it for the analysis. Also, before anonymizing the data, Dropbox linked their data to researchers’ publication data from the Web of Science resulting in a final data set of about 16,000 researchers.” So Dropbox did some of the data…to me, that’s not necessary…like that methodology needs to be included in any research findings about it.
AUDREY: Because the researchers even if we’re having some kind of clean handoff where they aren’t receiving specific histories but some kind of network analysis, that’s still a research decision. And if the research team isn’t in on it, then it may be hard for them to know what the impact of those choices are.
CHRISTIE: And we’re kind of harping on the research methodology part of it but why it blew up in the news was the privacy aspect of it.
AUDREY: I felt the need to talk about the research methodology because I haven’t seen it get called out very much. And I want us all to be pretty skeptical when we hear these kind of blanket research results that don’t give us an ability to reason about whether those are good results or not.
CHRISTIE: And unfortunately because there’s so much gatekeeping around the results of scientific research, we’re trained to just accept these third party…this is not third party in this case, but we’re trained to accept these watered down summaries of it. And there’s very few…I mean, pre-print services are becoming more and more common but that’s up to the authors to submit their pre-prints. So, it’s frustrating to me.
AUDREY: And we have an article…so [inaudible] told us the next Recompiler issue is about science and we have an article about pre-prints specifically because they are such a big part of the process of making science open to people to understand it and make sense of it.
CHRISTIE: Nice. I’m curious because I looked at arXiv, SocArXiv, and ResearchGate. Those are the three places I looked and I just did some general googling but I don’t know if I have an inclusive list of all the places to search for pre-prints.
AUDREY: It may depend a little bit on the specialty.
CHRISTIE: Yeah.
AUDREY: The data privacy part of this, users didn’t know that somebody was going to look at the way that they use Dropbox and analyze it and potentially gain their own professional credibility from doing that with their data and to expose potential insights without the users being aware of it.
CHRISTIE: And Dropbox’s response was, “We didn’t give them any personal information. And even if we did, we’re allowed to do that on our terms of service.”
AUDREY: Sure. Your terms of service probably allows you to do a lot of things. That doesn’t mean that they’re ethical or good.
CHRISTIE: So there’s a gap there. And we come back to this thing where it’s really not possible for users to grant informed consent to this stuff because the power dynamics are so lopsided.
AUDREY: This is another article in that upcoming issue about the way that there isn’t an institutional review board type entity that monitors corporate research that makes them actually do an ethical analysis before they get going.
CHRISTIE: Right. We had some sort of ancillary things that came up in relation to this. There’s this post from Emily about just how much of our data is sort of brokered between different third parties when we sign up for a service.
AUDREY: And just in the course of doing business, she focuses on the financial side of it, fraud prevention. But just in the course of doing business, the ways that third party sources gain and aggregate data.
CHRISTIE: And then there’s this post from Jamey Sharp about ways to collect statistical data that is privacy conscious, which is a very detailed post and has lots of links to the source research papers which I really appreciated after having none of that in the HBR article. It would be good for a technical deep dive.
AUDREY: For sure. I think I read about the first third of Jamey’s post. And I really liked just that starting idea about how clustering could aid anonymity. I think if Dropbox had used the kinds of techniques that Jamey is talking about, then they would have probably boasted of it. But that those would be very reasonable upfront analysis steps to take.
CHRISTIE: Although I did read the part…you may not have gotten to this yet but there’s a part that talks about multiparty…I’m not sure what to call it…multiple parties participate in the protocol. And I think that has to be considered in the collection of data upfront, so I don’t think that’s a technique you could apply after data has been collected. I barely understood what I was reading. But the sense that I got from it was that you would have to design the research specifically for that and that it might require some pretty big changes to how data is collected.
AUDREY: Interesting.
CHRISTIE: But that’s just intuition.
AUDREY: But you could include that in the architecture of your data collection then.
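One way to picture a multiparty design that has to be built into collection from the start is additive secret sharing, sketched below. This is a swapped-in illustration of the general idea, not the protocol from Jamey's post. Each participant splits a one-hot answer into random shares, one per server, so no single server sees an individual's answer; only the combined totals give the histogram. The bucket answers, server count, and modulus are all hypothetical, and the sketch assumes honest participants and servers that do not collude.

```python
import secrets

MODULUS = 2 ** 61 - 1  # shares are integers modulo this value (arbitrary choice)


def split_into_shares(value, n_servers):
    """Split an integer into n random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_servers - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def submit(bucket, n_buckets, n_servers):
    """One participant's report: a one-hot vector, secret-shared per server."""
    one_hot = [1 if i == bucket else 0 for i in range(n_buckets)]
    per_bucket = [split_into_shares(v, n_servers) for v in one_hot]
    # Transpose so that each server receives one share of every bucket.
    return [[per_bucket[b][s] for b in range(n_buckets)] for s in range(n_servers)]


def aggregate(reports, n_buckets, n_servers):
    """Each server sums its own shares; only the combined totals reveal counts."""
    server_totals = [[0] * n_buckets for _ in range(n_servers)]
    for report in reports:
        for s in range(n_servers):
            for b in range(n_buckets):
                server_totals[s][b] = (server_totals[s][b] + report[s][b]) % MODULUS
    return [
        sum(server_totals[s][b] for s in range(n_servers)) % MODULUS
        for b in range(n_buckets)
    ]


# HYPOTHETICAL answers: the bucket chosen by each of six participants.
answers = [0, 1, 1, 2, 1, 0]
reports = [submit(a, n_buckets=3, n_servers=2) for a in answers]
histogram = aggregate(reports, n_buckets=3, n_servers=2)
print(histogram)  # [2, 3, 1]
```

Because the splitting happens on the participant's side before anything is sent, this kind of scheme cannot be bolted on after raw data has already been collected in one place.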
CHRISTIE: Right. And it would be a much more participatory model. And it reminded me of some of the projects you talked about that you learned from Allied Media Conference which I was just editing that episode.
AUDREY: The Our Data Bodies stuff.
CHRISTIE: I think so, yeah.
AUDREY: Giving people ways to start to examine how their data is being [inaudible].
CHRISTIE: Yeah.
AUDREY: Who knows? Some people probably would opt in to Dropbox research about collaboration if they understood that their data could be protected. I doubt that anybody involved in this project had the faintest idea how it was going to blow up.
CHRISTIE: No, clearly not because they had to scramble to update all their stuff.
AUDREY: Right.
CHRISTIE: Though the one from Dropbox had to delete her social media accounts.
AUDREY: Yeah. I mean, that sucks. I hope there wasn’t gross harassment happening. But of course, there’d be pushback. The way that Dropbox bragged about it, what they were thinking was a selling point actually was a thing that people found really offensive, the scope of it. They’re like, “Look at how many people we were able to look at.” And everyone who might have been affected was like, “Look at how many [inaudible] you were looking at.” It was not taken in the way that I think the researchers anticipated. I don’t know. I just want to end with if they were actually publishing this in a peer reviewed journal, there are lots of ways that those can fail but it would have had at least some of the same scrutiny before they put it out in public.
CHRISTIE: Right. Yes, that’s what I want to look in a little while and see if a paper does come out of it.
AUDREY: Or if they just delete the whole thing and walk away.
CHRISTIE: So Google Translate has been up to some shenanigans.
AUDREY: For a while, it sounds like.
CHRISTIE: Basically, I don’t know how people [inaudible] but if you put in some sort of nonsense into Google Translate and set it to certain languages to translate into English, you get Bible verses.
AUDREY: Or other similarly arcane pieces of text.
CHRISTIE: The theory is that Google’s been using certain religious texts like the Bible to train its neural network for doing translation. And I could see why they’re doing that because the Bible I think has probably been translated into every language.
AUDREY: Every language that it possibly could be. And for the kind of translation, the data that they’re looking for, they need a piece of text that’s been translated into as many different languages as possible because what they’re asking it is if we take this chapter and verse, figure out how it’s like this other chapter and verse in a different language. Find the connection between those and build its analysis. And that’s using this neural network training approach to doing that.
CHRISTIE: One of the researchers Vice talked to while they were trying to figure this out pointed out that the languages that it seems to be doing it with are ones that have fewer training sets. So they’re, for whatever reason, not as in wide use or just…I don’t know. And this is what really jumped out at me. It says they’re quoting this person named Rush. I didn’t grab what his title was. But, “The models are black-boxes that are learned from as many training instances that you can find. The vast majority of these will look like human language, and when you give it a new one it is trained to produce something, at all costs, that also looks like human language. However if you give it something very different, the best translation will be something still fluent, but not at all connected to the input.”
AUDREY: Failing to have an appropriate answer to it, it still generates an answer.
CHRISTIE: And this is hugely problematic when you also think about image recognition and all the other things that they’re using machine learning and neural networks to do.
AUDREY: There was just that thing from the ACLU this week about using Amazon’s Rekognition system. They fed it a bunch of mug shots and then they fed it a bunch of congresspeople and they asked it to identify the congresspeople.
CHRISTIE: Yes.
AUDREY: And it got people of color especially wrong.
CHRISTIE: So, I don’t know if that’s the same type of category error but it’s that kind of thing I was thinking about. It’s like if you’re trying to get a computer, a neural network, to do work for you, and it doesn’t have a good answer, in a lot of cases you’d really rather have no answer than an answer that is not connected at all to the input.
AUDREY: You want it to say, “I have no idea what that is,” rather than doing something absurd with it. I mean, I enjoy the idea that there’s a way to kind of make that training set pop out by doing something very silly with it. We have so few ways to look into these boxes.
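The idea of a model saying “I have no idea what that is” can be sketched in a few lines. This is only an illustration, not how Google Translate or Amazon’s system works: the function names, the labels, and the 0.8 threshold are all hypothetical. It turns raw scores into probabilities and declines to answer when the top probability is low. One caveat: neural networks can be confidently wrong on inputs unlike their training data, so a simple threshold like this is a starting point rather than a fix.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_abstain(scores, labels, threshold=0.8):
    """Return the top label, or None when the model is not confident.

    `threshold` is a hypothetical cutoff chosen for illustration only.
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None  # "I have no idea what that is"
    return labels[best]


labels = ["cat", "dog", "bird"]
confident = predict_or_abstain([5.0, 0.1, 0.2], labels)  # one score dominates
unsure = predict_or_abstain([1.0, 1.1, 0.9], labels)     # scores nearly tied
```

With one dominant score the function returns a label; with nearly tied scores it returns `None` instead of guessing.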
CHRISTIE: I just feel like we already have enough black boxes in our just daily existence just like figuring out how the universe is and why things happen and whatnot. We’re just adding layers and layers of technology that create these absurd interactions. And it’s madness.
AUDREY: Yesterday when I was looking at this article and looking at some of the examples, I went and found the Reddit that I think the Vice article had mentioned and it’s called something like translation gate. I looked at the…there’s just lots of screen caps of people saying, “Oh, I tried this and this is what I got.” And I looked at what they were doing and in a lot of cases, it’s repeating a syllable over and over again or repeating a pair of words over and over again. And I thought it was really interesting that the repetition seems to be a big part of it. They’re generating like a phrase length that has to be long enough for it to pull out this chunk of text.
CHRISTIE: So it’s that particular pattern that triggers this edge case.
AUDREY: Yeah. And there’s some very silly conspiracy theory stuff going on out there about people looking at this too.
CHRISTIE: Of course.
AUDREY: But just seeing the way that people are playing with it was really interesting.
CHRISTIE: I think it also exposes another thing with computing, where I think we sort of have been trained to think of computers as providing knowable, definite answers. And for so much of computing, that’s not actually true. We can’t prove that a computer program is going to do a certain thing. There are certain unsolvable problems in computing.
AUDREY: And there are ways to make computer programs that are more provable than most of the software that we use but at the stage that we get to it, we see it.
CHRISTIE: And then when you think of also things like Spectre.
CHRISTIE: Amazon?
AUDREY: Sure.
CHRISTIE: Amazon had its holiday.
AUDREY: With balloons and party hats?
CHRISTIE: I at least wanted it to be on a prime-numbered day of the month, but it wasn’t. It was on the 16th.
AUDREY: Yup, the 16th.
CHRISTIE: Do people not know what Prime…so, Prime Day is a day that Amazon made up to put a bunch of shit on sale.
AUDREY: It’s like a Black Friday type thing.
CHRISTIE: Yeah, in July. They’ve been doing it for a couple of years, at least. And in the lead up to this year’s Prime Day, a bunch of Amazon workers called for a strike. Have you seen much information about how that went?
AUDREY: I haven’t, no. I saw a fair amount of the lead up to it about which teams of workers, which warehouses, which unions might be involved but not very much that talked about the afterward. Just that Amazon was definitely making an effort to claim that it had no negative impact whatsoever, like it could not hurt us.
CHRISTIE: Despite the fact that there were major issues at the site throughout the day.
AUDREY: Right. And I only know about that because they were giving everybody [inaudible] error messages.
CHRISTIE: Right.
AUDREY: I kept seeing screenshots.
CHRISTIE: I would not have clicked on this Motherboard article because it says the Motherboard guide to Amazon Prime Day’s best deals and the whole thing is a giant troll.
AUDREY: I only clicked on it because somebody I trust who had shared the strike information before retweeted it, and I was like, “Well, if they’re sharing it, there’s going to be something good here.”
CHRISTIE: Right. So they say: Best Deals on Being an All-Consuming Mega Retailer That Eats Up Small Businesses. And it’s a very self-serving troll because all the links are to the previous article. Oh no, it’s not just links to their own articles. They link externally, too.
AUDREY: But it’s quite a catalog of criticisms of Amazon that they’ve put together for this.
CHRISTIE: It is. Best Deals on Building a Creeping Surveillance State, Best Deals on Reducing Human Labor to Algorithmic Profit, Top Reviews from People Who Worked There.
AUDREY: Your post gave me a good laugh last week.
CHRISTIE: If you’re a good Amazonian, you become an Amabot. Of course, I think most of the striking stuff was focused on Europe because we’ve pretty much destroyed unions here in the United States.
AUDREY: Although wasn’t there a group in Michigan maybe, there was a group of Amazon warehouse workers in the US that had done their labor action recently.
CHRISTIE: Yeah.
AUDREY: They got just a little bit of press. Maybe I can dig that up for the notes, too. They were asking for specific accommodations during Ramadan, I think. And so they…I forget what exactly they had done to pressure Amazon for that. But yeah, it got some press.
CHRISTIE: So we just kind of got a…in honor of Prime [inaudible] not just a handful of sort of different aspects of things to talk about, like “The Hidden Environmental Cost of Amazon Prime’s Free, Fast Shipping” and “I’m Starting to Have Serious Doubts About Amazon Prime.”
AUDREY: Which is sort of about how it affects your spending and whether it’s forcing you to make different decisions than you would have otherwise.
CHRISTIE: Both articles sort of talk about the impact of Amazon Prime. It’s been wildly successful for Amazon. They keep raising the price. Prime customers spend more money and they make more shipments. What am I trying to say? They don’t wait to buy a bunch of stuff at Amazon so that it’ll come all shipped at once, they buy things piecemeal. And so, it generates a lot more shipping and packaging. There is some interesting stuff about how our use of cardboard has increased but our recycling of it has gone down. There’s a thing about how people don’t understand they need to put it in the recycling bin. I’m just like…in my mind, cardboard is the easiest thing. Like plastic, I get confused about. It can vary by jurisdiction but…
AUDREY: Cardboard’s easy.
CHRISTIE: Yeah. The waste management recycling companies are like, it was unexpected to them. There was a quote saying, “We expected newsprint to go down but the fact that there’s so much cardboard surprised us.” And then all the increased vehicle traffic. I didn’t know this but Amazon’s working on their own fleet of airplanes.
AUDREY: Yeah, I saw some things about that earlier this year. I still can’t decide whether I think that they’re going to try to buy UPS or just replace them. I mean, they have been replacing some of their use of local delivery services that way.
CHRISTIE: Right. Let’s think about this. What are some of the big…so they bought Whole Foods. But yeah, I guess it could go either way. How big is UPS compared to Whole Foods?
AUDREY: I don’t know.
CHRISTIE: I want to do some research on this because I’m very curious.
AUDREY: Whole Foods has their own shipping, distribution, warehousing. I don’t know. It seems like trying to combine Amazon warehousing and Whole Foods warehousing and distribution is probably going to be a nightmare. I would hope they wouldn’t do that.
CHRISTIE: Where I was going with this was prior to buying Whole Foods, did Amazon try to make [inaudible] on their own in that space and I can’t remember exactly.
AUDREY: Well sure there’s been Amazon grocery services of various sorts for a while.
CHRISTIE: Have they? Okay. I don’t do any of that stuff.
AUDREY: Some of it hasn’t been available in Portland. But in Seattle and I think San Francisco may have seen some pilots of that. They’ve been doing like the Kozmo style grocery delivery stuff on and off for quite a while.
CHRISTIE: Kozmo style?
AUDREY: Kozmo was that delivery service in about 2000 that you could call them and they would bring you ice cream.
CHRISTIE: Okay.
AUDREY: Or like call them on the website.
CHRISTIE: I was poor and busy and in college. I think that one skipped me by.
AUDREY: I had friends who had finished college and entered the tech industry.
CHRISTIE: So yeah, that’s a good question. That’s very disturbing. It seems like they’re building out…
AUDREY: …trying to get into the grocery business for a while?
CHRISTIE: Well, no. I was thinking about them buying UPS because what I’ve seen happen to Whole Foods is I’ve seen the quality go down substantially and I don’t want that to happen to UPS. That’s what Amazon does. Whatever it swallows up, it then like further commoditizes and makes cheap and not very good…
AUDREY: If I were a financial business analyst, the specific thing that I would be trying to figure out is if Amazon Prime buys or replaces UPS in some way, no longer pays another company for the services that UPS provides to them, does that make Prime delivery profitable?
CHRISTIE: You mean in and of itself?
AUDREY: Yeah.
CHRISTIE: I don’t think Amazon cares.
AUDREY: I mean, because they always talk about them losing money off of it. Well, maybe not.
CHRISTIE: I think that is their whole magic is do a loss leader and then use that to take over things.
AUDREY: I’ve definitely seen some of the business folks say that they don’t think that that will work for them in the long term, and having some specific reasons that they think that.
CHRISTIE: I’d be curious about that because from reading this stuff about just how much more people with Amazon Prime spend, it seems to more than make up for the stuff. But what they might do is, because they’ve slowly increased the price, I think that they’ll just selectively make things ineligible for two-day shipping or make it three-day shipping. I think they’ll slowly sort of adjust the container to make their money models fit.
AUDREY: Have you noticed the rotation of incentives that they’ve offered to try to get you to pick slower or grouped shipping?
CHRISTIE: I usually ignore them because there seemed to be really piddly incentives like a dollar credit for a thing I’m not really using.
AUDREY: No, they have never made sense to me as incentives, which I think is why it catches my eye. Like for a while, you could get a credit on a Prime Pantry box. I think I’ve also seen video rental credit or just other Amazon services obviously. But the fact that I haven’t seen a single consistent thing makes me think that there’s a team that’s just sitting there, experimenting, trying to find something that people respond to.
CHRISTIE: Right. And I also wonder is that to save money on shipping or is that to get people interested in other products?
AUDREY: Right.
CHRISTIE: I don’t know.
AUDREY: And I certainly don’t have enough information to make a guess about that.
CHRISTIE: I’ve been thinking a lot about this just in general because there’s an environmental cost and already I am pretty good about bundling shipments although Amazon usually doesn’t give you control over that. But I do tend to not buy things one off but like, “Oh, I’ve got my list. I want to make an order now.” But I do think that enticement of free shipping and that you’re in the club so you get this thing, I think it can make you spend more than you ought to.
AUDREY: Or at least shift a large portion of your spending to Amazon.
CHRISTIE: There’s certain things we may be starting to buy on Amazon because we had Prime and because we’re already ordering other things. And then the other thing that I have experienced, and the Reply All podcast did an episode about this, is just it used to be that Amazon was a pretty trustworthy place to get stuff and you could look at the reviews and feel like you were getting a decent product. And lately, in the last year or so, I think that’s really dropped off. And Reply All does a really good job kind of going into that and they talk about [inaudible] Amazon started letting sellers outside of the United States sell direct on the platform and not have to go through a US distributor. And they also do things like they collapsed the sellers into one…[inaudible] a product all the sellers get collapsed into one so you actually don’t know who you’re buying from.
AUDREY: And I know that because I always try to make sure that I’m buying a product from Amazon directly unless it’s something that I think is pretty low risk because there are so many knockoffs out there. And even if you can return it and get Amazon to ship you the thing you actually wanted, it’s just not worth the effort.
CHRISTIE: And it’s weird. I think you have more control over what vendor you’re buying from when you’re buying a book and other things. Because I do that along with used books. I’ll go through and I have a few favorite used book vendors I like to buy from even if they’re not necessarily the lowest price. But I feel like I almost never see that same interface when I’m just buying something else that’s not a book.
AUDREY: I noticed a specific thing the other day and like you, I’ve tried to cluster my Amazon spending and shipping. And I sit there and I figure out whether I can go buy it in person or not. But I noticed that there’s something where they try to give you the lowest price on the page even if it’s coming from another seller, and I think that they do add the shipping cost to it. But a couple of times, it’s been like a penny difference between some third party vendor and Amazon. And so I’ll look at it and go, “Well, it says that it’s being shipped by this other company. How did they get that? Is it not available from Amazon directly?” And I go and look and see that it’s just the price difference, some very small price difference that the other companies gained.
CHRISTIE: Yeah. And that’s part of what these scammers are doing is that they’re…Reply All interviewed someone who was like the original maker of a custom thing and they got in a war with these scammers that would take advantage of that listing collapse and they would price their knockoff product just under. He said there was one time where he just went back and forth with them for hours or days or whatever.
AUDREY: Yeah.
CHRISTIE: The other thing that’s happening is that you can no longer trust the verified purchaser reviews because there’s whole review farms for Amazon where they have addresses in the United States and they’re actually shipping product there and then writing reviews for people. So if you’ve ever gotten an Amazon package you didn’t order out of the blue, that’s probably what’s going on.
AUDREY: That, for me, was the funniest part of what they talked about on the podcast that people were getting products that they didn’t order because somebody was taking advantage of the verified review thing. It didn’t matter where it got shipped as long as it was US, so that they could then put in a review as a verified purchaser, not care about the product at all.
CHRISTIE: And Amazon is abdicating any responsibility here. There was someone that actually got one of those electronic hover board things, which I don’t understand why they’re called hover boards. And it caught on fire and caused her injury and she tried to sue Amazon and it didn’t go anywhere. So this is not just like, “Oh, I got a crappy knockoff.” So now, there’s some health and safety issues here.
AUDREY: Yeah, for sure.
CHRISTIE: And then we probably shouldn’t talk about it now but that article you sent to me in The Atlantic about monopoly, I think, is really interesting. Maybe we would talk about it separately in another episode.
AUDREY: There’s some folks going after Amazon from a monopoly breaking perspective.
CHRISTIE: And the whole history of how in the ’70s, Bork and the Chicago School defined the measure of whether something is a monopoly and whether that is bad as…I mean, monopolies aren’t bad if they result in lower prices for the consumer. And like just how much that is a bad model for where we are now.
AUDREY: Right. And led to some things that we can see pretty explicitly.
CHRISTIE: Yeah, and especially if you think of Facebook. Facebook is free. We need new ways to evaluate monopoly power, other than price.
AUDREY: Yeah.
CHRISTIE: Okay. That was a lot.
AUDREY: We skipped last week. We had to catch up.
CHRISTIE: Yeah. I like this idea of we celebrate Prime Day by having a whole block criticizing Amazon.
AUDREY: Yeah. For me, the sales are sometimes beneficial but I did not use Amazon last week.
CHRISTIE: I’m not in the habit of criticizing anyone who uses it because it’s a huge…it may be the only or best way that things people need are accessible to them. But I think it’s still good to talk about the impact and whatnot.
AUDREY: The downsides and the ways that we can maybe use our critique.
CHRISTIE: But also there’s some things that we love on the internet this week.
AUDREY: There are.
CHRISTIE: What have you got?
AUDREY: I read this article the other day about a scientist who has set herself a nearly daily habit of writing a Wikipedia page for a female scientist.
CHRISTIE: Awesome.
AUDREY: And what I liked about it is…it’s just a fairly short interview. But what I liked about it that caught my attention immediately was that she starts off by criticizing current diversity in science approaches and the amount of money that’s spent on diversity efforts that do nothing. And she specifically said, why aren’t we being scientific about how we increase the diversity of our field? And so she looked at what research was out there, came up with a specific approach that she could try, and is really actively doing it.
CHRISTIE: Awesome.
AUDREY: And there’s a book that she recommends in the interview that talks about this in a little bit more detail – the science of what we can do.
CHRISTIE: Awesome. Did you look at some of the Wikipedia pages she’s written?
AUDREY: I didn’t, no. Not yet.
CHRISTIE: Okay. Mine is also library related and I was like, “Wikipedia isn’t exactly a library.” So a lot of people were on Twitter talking about libraries this week because of that really stupid piece in…a not really thought out piece in Forbes about how we should just get rid of libraries because they cost taxpayers’ money. So I was really happy to see this. But our local library, the Multnomah County Library has launched this library music project. It’s LibraryMusicProject.com and they have all kinds of local artists available for streaming, and I just think that’s super cool.
AUDREY: Yeah, that’s great.
CHRISTIE: And I think anyone can stream a certain amount. But if you want to download or make playlists, you have to log in with your library card info. It’s probably Multnomah County and then whoever else they have reciprocal agreements with.
AUDREY: There’s actually a big partner agreement in our area. There’s a few different counties that all share access where you can walk in and get a card for another one of them if you’re in the area.
CHRISTIE: The original piece I read was really annoying but everyone’s response to it, just all the stories people shared about how they use libraries growing up, how they use them now, different really awesome things libraries do, how they really serve as these amazing community centers, all of that was really heartwarming. And then to sort of see this, I was like, “Yeah, libraries are totally keeping up with the times. This is great.”
AUDREY: They are.
CHRISTIE: Okey-dokey. I think that’s our show.
AUDREY: All right.
CHRISTIE: Thanks everyone for listening. Thanks, Audrey, for hosting with me again for another week.
AUDREY: Bye.
CHRISTIE: And that’s a wrap. You’ve been listening to The Recompiler Podcast. You can find this and all previous episodes at recompilermag.com/podcast. There you’ll find links to individual episodes as well as the show notes. You’ll also find links to subscribe to The Recompiler Podcast using iTunes or your favorite podcatcher. If you’re already subscribed via iTunes, please take a moment to leave us a review. It really helps us out. Speaking of which, we love your feedback. What do you like? What do you not like? What do you want to hear more of? Let us know. You can send email feedback to podcast@recompilermag.com or send feedback via Twitter to @RecompilerMag or directly to me, @Christi3k. You can also leave us an audio comment by calling 503 489 9083 and leaving a message.
The Recompiler podcast is a project of Recompiler Media, founded and led by Audrey Eschright and is hosted and produced by yours truly, Christie Koehler. Thanks for listening.