Transcript of SFP#7 Artificial intelligence as Free Software with Vincent Lequertier
This is a transcript created with the Free Software tool Whisper. For more information and feedback reach out to podcast@fsfe.org
WEBVTT 00:00.000 --> 00:18.080 Welcome to the Software Freedom Podcast. 00:18.080 --> 00:22.560 This podcast is presented to you by the Free Software Foundation Europe. We're a charity 00:22.560 --> 00:25.480 that empowers users to control technology. 00:25.480 --> 00:29.020 I'm Matthias Kirschner, the President of the Free Software Foundation Europe, and I'm 00:29.020 --> 00:31.820 doing this podcast with my colleague, Bonnie Mehring. 00:31.820 --> 00:33.060 Hello! 00:33.060 --> 00:36.980 In this episode, we will talk about artificial intelligence and free software, which 00:36.980 --> 00:41.620 for me is also a lot about the question of how to distribute power between computers or machines on the one side 00:41.620 --> 00:43.900 and humans on the other side. 00:43.900 --> 00:46.020 Our guest for today is Vincent Lequertier. 00:46.020 --> 00:51.860 He is an active FSFE contributor in our French team, our system hackers team, and 00:51.860 --> 00:55.140 regularly also gives talks for the FSFE. 00:55.140 --> 01:00.060 In his day job, he is a PhD student at the University of Claude Bernard, researching 01:00.060 --> 01:03.740 artificial intelligence for healthcare systems. 01:03.740 --> 01:05.060 Hello Vincent. 01:05.060 --> 01:06.060 Hello Bonnie. 01:06.060 --> 01:07.060 Hello, Matthias. 01:07.060 --> 01:08.060 Hello Vincent. 01:08.060 --> 01:13.540 To just go ahead, Vincent, when we talk about artificial intelligence, I automatically 01:13.540 --> 01:18.220 think of HAL from 2001: A Space Odyssey or Samantha, the artificial intelligence 01:18.220 --> 01:20.100 from the movie Her. 01:20.100 --> 01:24.900 Is this how I should imagine artificial intelligence looks like? 01:24.900 --> 01:31.140 You know, I never got around to seeing the 2001 Space Odyssey movie, but I did watch 01:31.140 --> 01:32.140 the movie Her.
01:32.140 --> 01:38.100 I found it a bit creepy, but I don't think that we are anywhere close to making voice 01:38.100 --> 01:42.180 assistants with emotions and personality. 01:42.180 --> 01:45.540 But AI is much more than interactive robots. 01:45.540 --> 01:51.220 It encompasses a lot of different techniques, aiming at simulating, and in some cases 01:51.220 --> 01:54.660 surpassing, human intelligence. 01:54.660 --> 02:01.660 It also includes chat bots, voice recognition, text translation, bots in video games, and 02:01.660 --> 02:02.660 so on. 02:02.660 --> 02:08.820 A formal definition of artificial intelligence may be any system that can learn how to 02:08.820 --> 02:12.500 perform a task based on observation. 02:12.500 --> 02:19.340 If I want to cite practical examples of AI, I might say things like Mycroft, the voice 02:19.340 --> 02:21.940 assistant, which is free software. 02:22.500 --> 02:27.500 OK, so we do have artificial intelligence in our lives. 02:27.500 --> 02:35.980 Yes, whether we realize it or not, AI is here for us, and it's a powerful technology that 02:35.980 --> 02:43.820 has been in our lives for maybe one decade or two. 02:43.820 --> 02:49.660 Vincent, in your presentations about your work with AI, your main demand there was in the 02:49.660 --> 02:55.740 past that artificial intelligence should be accessible, transparent, and fair. 02:55.740 --> 03:00.100 I think it would be very interesting for our listeners to dive more into those criteria 03:00.100 --> 03:02.700 and what you understand by that. 03:02.700 --> 03:06.580 Maybe we could start with the fairness part. Bonnie, you had some questions when we were 03:06.580 --> 03:07.580 preparing for this. 03:07.580 --> 03:09.740 Do you want to go ahead? 03:09.740 --> 03:10.740 Yes, please. 03:10.740 --> 03:15.940 Vincent, I was wondering, what does fairness mean for an AI?
03:15.940 --> 03:21.820 Would it be seen as unfair if an AI does not follow the laws of a society, like the law 03:21.820 --> 03:26.540 to not discriminate against any people, no matter their race, sex, or gender? 03:26.540 --> 03:34.460 So yes, if I want to define fairness for artificial intelligence, fairness will mean the equality 03:34.460 --> 03:41.140 of treatment for everyone, regardless of things that you don't want to include in 03:41.140 --> 03:43.620 your prediction models. 03:43.620 --> 03:48.180 For example, you might want to have a fair artificial intelligence that does not take into 03:48.180 --> 03:57.260 account your gender or your race or your religion or your age or any kind of sensitive attributes. 03:57.260 --> 04:02.420 Do you have an example of how an AI could discriminate against someone? 04:02.420 --> 04:05.580 Yes, so I have a couple of examples. 04:05.580 --> 04:09.540 There was the case of racial bias in healthcare a couple of years ago. 04:10.100 --> 04:15.500 This has been reported in a research article whose title is Dissecting racial bias in 04:15.500 --> 04:19.460 an algorithm used to manage the health of populations. 04:19.460 --> 04:25.500 And in this article, the authors found that a widely used algorithm used to assess the risk 04:25.500 --> 04:32.260 of health issues, so the health issues of people, had racial bias. 04:32.260 --> 04:38.180 And this algorithm is used to identify high-risk patients, who get more care resources 04:38.180 --> 04:41.660 and attention from the hospital staff. 04:41.660 --> 04:46.780 But unfortunately, the issue with this algorithm is that to get the same risk score as white 04:46.780 --> 04:51.180 people, black people had to be much more sick. 04:51.180 --> 04:57.260 And this is presumably caused by basing the risk estimation on the health of the people, 04:57.260 --> 05:01.060 but also on the estimated health care cost. 05:01.060 --> 05:06.940 So as you can see, AI bias can have important real-world consequences.
05:06.940 --> 05:12.980 And I can give you another example, this time in the US justice system. 05:12.980 --> 05:19.020 There is a proprietary software called COMPAS used to tell how likely someone is going 05:19.020 --> 05:21.620 to recidivate, to repeat their crime. 05:21.620 --> 05:27.180 An analysis by ProPublica revealed that the algorithm was racist. 05:27.180 --> 05:33.860 It turned out that compared to white people, black people had a much higher risk of being 05:33.860 --> 05:41.340 falsely considered as risky criminals that are going to commit their crimes again. 05:41.340 --> 05:47.900 So in other words, the algorithm said that black people were much more dangerous for society 05:47.900 --> 05:49.940 than white people. 05:49.940 --> 05:56.580 And conversely, white people were often misclassified as low-risk defendants, which means unlikely 05:56.580 --> 05:59.060 to repeat their crimes. 05:59.060 --> 06:05.340 So the false positive rate was much higher for black people compared to white people, and 06:05.340 --> 06:07.900 the reverse for the false negative rate. 06:07.900 --> 06:13.620 Again, this shows that unfair algorithms exist in the wild and that they are used in critical 06:13.620 --> 06:15.020 cases. 06:15.020 --> 06:18.500 And on top of that, those two algorithms aren't free software. 06:18.500 --> 06:24.740 OK, before I go over to my next question, could you briefly describe what false positive 06:24.740 --> 06:27.100 and false negative mean? 06:27.660 --> 06:34.220 Yes, so to explain false positives and false negatives and true positives and true negatives, 06:34.220 --> 06:38.980 I will give you an example based on spam detection. 06:38.980 --> 06:45.540 So spam is the emails you don't want to see, and to tackle spam, 06:45.540 --> 06:51.620 there is some software that is used to classify whether an email is spam or a legitimate 06:51.620 --> 06:54.060 email.
06:54.060 --> 07:00.500 So if you get a message and it's a completely legitimate email, but the software classifies 07:00.500 --> 07:06.860 it as spam, it will be called a false positive, because the software thought that the email 07:06.860 --> 07:09.540 was spam, but it wasn't. 07:09.540 --> 07:17.940 If the email was in fact spam, but the software thought it was completely legitimate, it will 07:17.940 --> 07:24.900 be called a false negative, because the software thought that the email wasn't spam. 07:24.900 --> 07:30.300 And so the true positives and the true negatives are correct classifications, meaning that the 07:30.300 --> 07:36.420 software correctly classified the emails as spam or legitimate email. 07:36.420 --> 07:43.580 So this is an example that can be used to explain this concept. 07:43.580 --> 07:46.700 Vincent, I also have a question about what you just said. 07:46.700 --> 07:51.580 So I mean, you said that sometimes there are mistakes that happen. 07:51.580 --> 07:57.900 But I mean, when we look back in the history of humankind, there were a lot of occasions 07:57.900 --> 08:02.580 when humans on purpose discriminated against certain groups. 08:02.580 --> 08:08.420 And a lot of that was also done on purpose with architecture, with technical means, like 08:08.420 --> 08:13.060 for example in Lawrence Lessig's book, with whom we also talked about regulation before 08:13.060 --> 08:20.500 in one of the podcasts, there's an example of how bridges and train lines were used to 08:20.500 --> 08:27.180 make it harder for certain minorities to go to other parts of a city and get better 08:27.180 --> 08:28.380 jobs. 08:28.380 --> 08:34.940 How can we find out if something is done by mistake or if that's on purpose when you 08:34.940 --> 08:38.500 have an AI involved?
08:38.500 --> 08:45.620 So if you have only the end result of the AI, I mean if you have only the predictions, 08:45.620 --> 08:51.420 then you cannot really know the intent, like the purpose of the predictions. 08:51.420 --> 08:53.140 What you need is the source code. 08:53.140 --> 09:00.700 So free software will help you to know the purpose behind the predictions, because you know 09:00.700 --> 09:09.900 what the input of the AI was and you can also know what the design behind the prediction 09:09.900 --> 09:10.900 model was. 09:10.900 --> 09:17.060 So you can see how the data was processed and how the algorithm was used, and that way 09:17.060 --> 09:24.260 you know the purpose of the AI, and also you can know how the model was evaluated. 09:24.260 --> 09:31.020 I mean, what metric was used to evaluate the performance of the artificial intelligence. 09:31.020 --> 09:38.100 So you can know, for example, if the true positive rate was the same whether the person was 09:38.100 --> 09:46.420 a male or a female or black or white or whatever, and if you can do this kind of test, then 09:46.420 --> 09:50.940 you can see if the AI was fair. 09:50.940 --> 09:58.820 So by seeing the source code and with transparency, you can then guess the purpose. 09:58.820 --> 10:05.780 So this part is now about the demand for transparency you talk about in your presentations, right? 10:05.780 --> 10:13.620 Yeah, I think that there is a connection between how transparent an algorithm is and 10:13.620 --> 10:22.180 how fair it can be, because much like when we talk about security and free software, I think 10:22.180 --> 10:28.660 that we need transparency for algorithms to ensure that they are fair. 10:28.660 --> 10:34.100 If you cannot see the source code of an AI and if it's not transparent, then you cannot 10:34.100 --> 10:36.300 ensure that it will be fair.
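The per-group check described above (is the true positive rate the same for every group?) can be sketched in a few lines of Python. This sketch is not from the episode: the function name `group_rates`, the group labels "A" and "B", and the toy records are all assumptions made up for illustration, and a real audit would use a real dataset and an agreed-upon fairness definition.

```python
from collections import Counter


def group_rates(records):
    """Compute true and false positive rates per group.

    records: iterable of (group, actual, predicted) where 1 means "positive".
    The group labels are hypothetical stand-ins for a protected attribute.
    """
    counts = {}
    for group, actual, predicted in records:
        c = counts.setdefault(group, Counter())
        if actual == 1 and predicted == 1:
            c["tp"] += 1  # correctly flagged
        elif actual == 0 and predicted == 1:
            c["fp"] += 1  # wrongly flagged (false positive)
        elif actual == 1 and predicted == 0:
            c["fn"] += 1  # wrongly cleared (false negative)
        else:
            c["tn"] += 1  # correctly cleared

    rates = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        rates[group] = {
            "tpr": c["tp"] / positives if positives else None,
            "fpr": c["fp"] / negatives if negatives else None,
        }
    return rates


# Made-up toy data: (group, actual label, predicted label)
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 0, 0), ("A", 0, 1),
    ("B", 1, 1), ("B", 1, 0), ("B", 0, 0), ("B", 0, 0),
]
rates = group_rates(records)
for group, r in sorted(rates.items()):
    print(group, r)
```

If the rates differ a lot between groups, as they do in this toy data, that is the kind of signal that would call for a closer look at the model and its training data.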
10:36.300 --> 10:41.780 Much like you cannot really be sure about the security of the software if you cannot 10:41.780 --> 10:44.340 see the source code of it. 10:44.340 --> 10:49.580 Is it the case that if you have the source code of the AI, this would be sufficient 10:49.580 --> 10:55.420 to understand how it's actually working, or do you also need a lot of the training data 10:55.420 --> 11:00.100 or other data that the AI learned from? 11:00.100 --> 11:06.140 I think that to answer your question, to really understand the AI, you will need three things 11:06.140 --> 11:07.140 basically. 11:07.140 --> 11:13.900 You will need the data that was used to train the AI, or if the data is really sensitive 11:13.900 --> 11:20.060 and you cannot have access to it, you can have its characteristics. 11:20.060 --> 11:27.740 So what were the variables, and what was their distribution, what did they look like? 11:27.740 --> 11:31.860 Then you need to know how the AI was trained. 11:31.860 --> 11:35.860 So what was the source code used to train the AI? 11:35.860 --> 11:40.700 And then you need to be able to evaluate the AI. 11:40.700 --> 11:49.220 You need to have some kind of metric that tells if the AI was accurate and if the accuracy 11:49.220 --> 11:56.820 was the same regardless of some kind of attribute such as your age or gender or any kind 11:56.820 --> 11:59.860 of protected attribute. 11:59.860 --> 12:07.540 From my understanding, one thing that AI is able to do is to very quickly adapt and learn 12:07.540 --> 12:10.060 way, way faster than humans are. 12:10.060 --> 12:15.940 So when we are now talking about source code, is it correct that that means that at one 12:15.940 --> 12:22.660 time, like one minute, it is that source code, and a few minutes later it's completely different 12:22.660 --> 12:28.580 and the AI might act on different rules, or how should I imagine that?
12:28.580 --> 12:36.140 So I don't really think that AI learns faster than humans do, because I mean, if you 12:36.140 --> 12:43.460 show 10 pictures of cats to a two-year-old, he or she will be able to, you know, recognize 12:43.460 --> 12:50.860 cats or any kind of animal, but for AI you need to put through the algorithm like millions 12:50.860 --> 12:55.980 or billions of images for it to grasp any kind of subject. 12:55.980 --> 13:01.180 So I don't think it is generally true that the algorithm is faster. 13:01.180 --> 13:05.940 It just appears to be because we have a lot of computational power, so we can use a lot 13:05.940 --> 13:12.300 of computation to train algorithms for days and days and days in data centers. 13:12.300 --> 13:18.620 So for an AI to work, you have to train it with the right data, with the right training 13:18.620 --> 13:25.220 code, and evaluate its performance in a good way that measures how fair it is, and afterwards 13:25.260 --> 13:30.100 you have to monitor its accuracy over time. 13:30.100 --> 13:38.780 You have to check if the fairness of the algorithm stays the same, and if the AI fairness drops, 13:38.780 --> 13:46.420 you have to stop using it, and you have to detect it and then retrain your AI with 13:46.420 --> 13:53.380 new data or with new source code to make sure that the fairness is good. 13:53.380 --> 13:59.580 So yes, the source code of the AI does change and has to be checked. 13:59.580 --> 14:07.220 So that means that the AI itself would also have to be set up in a way that it's documenting 14:07.220 --> 14:11.540 itself in a way that humans understand that. 14:11.540 --> 14:14.340 Do I understand it correctly, Vincent?
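The monitoring idea from this exchange (measure a score regularly, notice when it drops, then stop and retrain) could look roughly like the following Python sketch. It is not from the episode. The function name `check_fairness`, the choice of "gap in true positive rate between two groups" as the score, the 0.1 threshold, and the daily numbers are all assumptions for illustration.

```python
def check_fairness(history, threshold=0.1):
    """Return the days on which two groups' true positive rates drift apart.

    history: list of (day, tpr_group_a, tpr_group_b) measurements.
    threshold: hypothetical maximum acceptable gap between the two rates.
    """
    alerts = []
    for day, tpr_a, tpr_b in history:
        gap = abs(tpr_a - tpr_b)
        if gap > threshold:
            # In practice this is where one would stop using the model
            # and schedule retraining with new data or new code.
            alerts.append((day, round(gap, 3)))
    return alerts


# Made-up daily measurements of the true positive rate for two groups
history = [
    ("day 1", 0.80, 0.78),
    ("day 2", 0.81, 0.76),
    ("day 3", 0.82, 0.65),
]
alerts = check_fairness(history)
print(alerts)
```

Which score to track and how large a gap is acceptable are decisions for the stakeholders, since there are several competing definitions of fairness.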
14:14.340 --> 14:20.820 Yes, so what you need is to make sure that the AI can give you some kind of metrics of 14:20.820 --> 14:29.540 fairness regularly, like each day you measure the fairness score, so that you can have some 14:29.540 --> 14:37.220 kind of measure and you can detect anomalies in the fairness. 14:37.220 --> 14:42.340 I do have a basic question here, because you have already mentioned the training data 14:42.340 --> 14:49.060 for an artificial intelligence. Who actually trains an artificial intelligence, and how should 14:49.060 --> 14:50.900 I imagine what the data looks like? 14:50.900 --> 14:58.140 For example, if you take Alexa, one of the examples you gave at the beginning for an AI, 14:58.140 --> 15:04.500 who trains Alexa, would this be Amazon or is it a person at home? 15:04.500 --> 15:10.100 So an AI is trained both with data and source code. 15:10.100 --> 15:16.900 Basically, when you are using Alexa or any kind of voice recording device, you create 15:16.900 --> 15:20.980 data that is used to train Alexa again. 15:20.980 --> 15:27.660 So you participate in the training of the AI because your data is used, but the training 15:27.660 --> 15:34.180 code is done by Amazon. So because Alexa is proprietary, we can really only guess what 15:34.180 --> 15:41.600 is happening there, but I guess that it's data scientists that train the AI, often with 15:41.600 --> 15:42.600 research tools. 15:42.600 --> 15:54.120 AI is developed a lot with open source software, and it's done inside companies by data scientists. 15:54.120 --> 15:59.120 I can imagine when you have lots of data and you have to train such an AI, that also means 15:59.120 --> 16:05.560 that you need a lot of processing power for the AI you deal with. Is it something that 16:05.560 --> 16:10.640 you can actually run on your computer, or do people have to imagine that more like you 16:10.640 --> 16:17.240 need huge data centers to train an AI, or how do you have to think about that?
16:17.240 --> 16:21.960 So it depends on what you want, and also it depends on the AI itself. 16:21.960 --> 16:28.960 If you want to reproduce the state of the art, I mean the paper that was just published 16:28.960 --> 16:34.840 last month, and reproduce all their results, well, you can't if you don't have giant 16:34.840 --> 16:42.040 data centers with entire teams that monitor computers and stuff, so you need a lot 16:42.040 --> 16:46.560 of money and computing power to do that. 16:46.560 --> 16:52.160 Because you need to train your AI for a lot of time and with a lot of data. By a lot 16:52.160 --> 16:57.920 of data I mean gigabytes or terabytes of data. 16:57.920 --> 17:04.680 But thankfully, you can still, with your home computer, I mean with your laptop, you can 17:04.680 --> 17:11.160 still get good results if you have more modest intents. 17:11.160 --> 17:18.640 Because of improvements in the hardware, like with GPUs, I mean graphical processing 17:18.640 --> 17:25.560 units, getting cheaper and cheaper, you can have powerful machines at home and you can 17:25.560 --> 17:28.480 use them to train some AI. 17:28.480 --> 17:31.360 And that's also possible because of free software. 17:31.360 --> 17:37.560 Because free software is available to you, you can use it yourself, and so you 17:37.560 --> 17:43.680 can train it on your personal computer and it will work. It will work also because you 17:43.680 --> 17:47.840 can leverage already trained models. 17:47.840 --> 17:55.600 What you can do is to take the already existing models and incorporate them inside your 17:55.600 --> 17:56.600 work. 17:56.600 --> 18:04.760 You can take the state-of-the-art model and just train some part of it to repurpose it for your 18:04.760 --> 18:05.760 needs.
18:05.760 --> 18:14.000 So I think that it's a very powerful technique, and that makes you able to use AI with your 18:14.000 --> 18:19.840 simple, I mean basic, computers and still have amazing results. 18:19.840 --> 18:24.640 So you kind of use pre-trained AI and continue with this. 18:24.640 --> 18:33.360 Yes, you can use pre-trained AI for a lot of common tasks such as image classification 18:33.360 --> 18:42.000 or, for example, NLP models, natural language processing models, that have gathered 18:42.000 --> 18:49.040 a lot of knowledge about language, and you can take these giant models and you can 18:49.040 --> 18:53.720 use them as part of your own model. 18:53.720 --> 18:59.880 For example, there is a very large AI competition that is called ImageNet. 18:59.880 --> 19:09.080 In this competition, you have to classify, I think, 10,000 different categories of dogs 19:09.080 --> 19:12.920 or animals or objects or things. 19:12.920 --> 19:17.080 So you have 10,000 different things to classify. 19:17.080 --> 19:22.960 And this is a competition done by researchers or scientists. 19:22.960 --> 19:30.060 And so the winner, so the model that is the most accurate at doing that, is often released 19:30.060 --> 19:32.560 publicly as free software. 19:32.560 --> 19:38.680 So let's say that you want to classify between two different 19:38.680 --> 19:41.960 things, like cats versus dogs. 19:41.960 --> 19:45.800 For example, you have images of cats and images of dogs. 19:45.800 --> 19:52.520 What you can do instead of starting from scratch is to take this big model and repurpose 19:52.520 --> 19:54.640 it for your needs. 19:54.640 --> 20:03.360 So you can train only part of it and reduce the 10,000 classification labels to only cats 20:03.360 --> 20:04.960 and dogs. 20:04.960 --> 20:11.320 And that will be much faster and more efficient than starting from scratch.
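The idea of reusing a trained model and training only part of it can be illustrated with a deliberately tiny, pure-Python sketch. It is not from the episode and it is not how a real image model is built: in practice one would use a deep learning framework and a real pre-trained network. Here the "pre-trained" part is a made-up frozen layer (`FROZEN_WEIGHTS`), the two numeric inputs stand in for images, and every name and number is an assumption for illustration. What it does show is the mechanism: the frozen part is never updated, and only a small new two-class head is trained.

```python
import math

# Stand-in for a large pre-trained model: these weights are frozen
# and never updated during training (hypothetical values).
FROZEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def frozen_features(x):
    """Pre-trained part: a fixed linear layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_WEIGHTS]


def train_head(samples, labels, epochs=500, lr=0.5):
    """Train only the new two-class head (logistic regression) on frozen features."""
    feats = [frozen_features(x) for x in samples]
    weights, bias = [0.0] * len(feats[0]), 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            z = sum(w * fi for w, fi in zip(weights, f)) + bias
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            err = p - y                     # gradient of the log loss w.r.t. z
            weights = [w - lr * err * fi for w, fi in zip(weights, f)]
            bias -= lr * err
    return weights, bias


def predict(x, weights, bias):
    """Return 1 ("dog") or 0 ("cat") for one input."""
    f = frozen_features(x)
    return 1 if sum(w * fi for w, fi in zip(weights, f)) + bias > 0 else 0


# Made-up toy inputs standing in for images: label 1 = "dog", 0 = "cat"
samples = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
labels = [1, 1, 0, 0]

weights, bias = train_head(samples, labels)
predictions = [predict(x, weights, bias) for x in samples]
print(predictions)
```

Only the handful of numbers in the head are learned here, which is why this approach needs far less data and computing power than training the whole model from scratch.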
20:11.320 --> 20:15.040 So I think we are now already partly in the accessibility part. 20:15.040 --> 20:19.200 So I mean, we talked about fairness, we talked about transparency. 20:19.200 --> 20:24.880 Now with accessibility, I mentioned, I mean, one part is that the tools are free software. 20:24.880 --> 20:30.000 So you can use them for any purpose, you can understand how they work, you can 20:30.000 --> 20:33.720 share them with others, and you can make modifications. 20:33.720 --> 20:38.400 Is there anything else which is necessary for AIs to be accessible? 20:39.040 --> 20:39.840 Yes. 20:39.840 --> 20:48.280 So what you need is powerful hardware, but thankfully, as I said, powerful hardware is getting 20:48.280 --> 20:50.160 cheaper every day. 20:50.160 --> 20:57.480 So you can have accessible hardware that you can use to train your own artificial 20:57.480 --> 20:59.760 intelligence. 20:59.760 --> 21:07.960 But unfortunately, the drivers for these graphical processing unit cards are proprietary. 21:07.960 --> 21:14.040 By that, I mean that the software that is used to make your card communicate with your 21:14.040 --> 21:22.880 computer is proprietary. That prevents AI from being fully accessible, unfortunately. 21:22.880 --> 21:31.440 So it makes AI training with free software much more complicated than it should be. 21:31.440 --> 21:35.520 So maybe, Vincent, to summarize it a bit up to this point. 21:35.520 --> 21:41.600 So for fairness, what do we need so that you have a fair AI? 21:41.600 --> 21:46.640 So you need to be able to measure the fairness of the AI. 21:46.640 --> 21:52.320 You need to evaluate how fair it is with some kind of score. 21:52.320 --> 21:58.840 And then you need to be able to monitor this score to make sure that it stays the same. 21:58.840 --> 22:05.280 And then you need to make sure that this score has been well established, because I mean, 22:05.280 --> 22:08.360 there are multiple definitions of fairness.
22:08.360 --> 22:11.400 And so you can leverage it in different ways. 22:11.400 --> 22:17.720 So you have to agree with all stakeholders to make sure that your fairness definition 22:17.720 --> 22:21.680 is good considering your problem at hand. 22:21.680 --> 22:28.880 And then you need to, as I said, monitor the fairness of the software. 22:28.880 --> 22:33.840 Could you also summarize transparency and accessibility for us? 22:33.840 --> 22:34.840 Yes. 22:35.000 --> 22:43.600 Transparency of AI means having access to the data that was used to train the algorithm. 22:43.600 --> 22:49.760 Or at least being able to know the characteristics of the input data. 22:49.760 --> 22:54.320 Then you need to have access to the source code of the AI. 22:54.320 --> 23:01.600 And then you need to define a metric that is used to tell if the model is accurate. 23:01.600 --> 23:08.360 And also if it's accurate for every value of a protected attribute. 23:08.360 --> 23:15.080 And then you need to make sure that everything is released as free software. 23:15.080 --> 23:22.720 And also what is great with regard to transparency is that recently, 23:22.720 --> 23:28.960 with the Free Software Foundation Europe, what we want to do is to have open science. 23:28.960 --> 23:35.320 So open science means to have science accessible to all and to consider software 23:35.320 --> 23:38.480 as a result of the research. 23:38.480 --> 23:44.440 As a citizen, you should be able to have access to the data that was used for the research 23:44.440 --> 23:46.600 and also to its source code. 23:46.600 --> 23:51.560 And all of that was used to create an AI. 23:51.560 --> 23:59.800 And so with these two things, you are able to have access to the artificial intelligence 23:59.800 --> 24:02.040 and to make it transparent. 24:02.040 --> 24:11.720 So to summarize the accessibility point, what you need is to be able to train the AI yourself. 24:11.720 --> 24:15.320 We need to have free software to train AI.
24:15.320 --> 24:22.040 So we need to have free frameworks and methods to train artificial intelligence. 24:22.040 --> 24:28.400 We also need to have cheap and reliable hardware to train artificial intelligence. 24:28.400 --> 24:36.000 And you need to have free drivers to be able to control these GPUs. 24:36.000 --> 24:39.560 Is there any AI out there which implements those three criteria? 24:39.560 --> 24:43.800 So do we have any positive examples there? 24:43.840 --> 24:51.880 So yeah, unfortunately, I don't know any kind of AI that is fair, accessible 24:51.880 --> 24:54.280 and transparent at the same time. 24:54.280 --> 25:01.480 And I think it's really bad, and we can do much better with regard to these three things. 25:01.480 --> 25:04.760 So yeah, no AI is perfect yet. 25:04.760 --> 25:09.640 Do you know of any upcoming legislation in Europe that is planning on implementing 25:09.640 --> 25:12.640 those three criteria for an AI? 25:12.640 --> 25:14.240 No, unfortunately not. 25:14.240 --> 25:21.600 I'm not aware of any kind of legislation that is ongoing, but fortunately there is a start, 25:21.600 --> 25:26.480 because the European Commission released a white paper in February. 25:26.480 --> 25:32.640 Its title is On Artificial Intelligence: A European approach to excellence and trust, 25:32.640 --> 25:35.760 which talks about AI transparency. 25:35.760 --> 25:40.320 And it demands that information about the data used to train models 25:40.320 --> 25:45.760 and how their accuracy is measured is provided to everyone. 25:45.760 --> 25:51.840 So this is not legislation, but I think it's a step in the right direction. 25:51.840 --> 25:52.880 So there's hope. 25:52.880 --> 25:56.800 Vincent, to wrap it up, what are the biggest challenges you see for free software 25:56.800 --> 26:02.080 in the field of artificial intelligence at the moment? 26:02.080 --> 26:06.240 So I think that artificial intelligence is really powerful.
26:06.240 --> 26:09.120 I mean, we have made a lot of progress. 26:09.120 --> 26:13.760 And in some regards, AI is much better than humans. 26:13.760 --> 26:19.200 Like, it can run for hours without any kind of concentration issues. 26:19.200 --> 26:24.160 I mean, it never gets bored and it has a consistent behavior. 26:24.160 --> 26:28.640 And you know, it can remember a lot of information. 26:28.640 --> 26:34.400 So I think that on these points, AI has a lot of advantages 26:34.400 --> 26:37.120 over humans. 26:37.120 --> 26:41.360 But I think that, yeah, AI can be leveraged to improve society. 26:41.360 --> 26:46.000 But I'm afraid of AI for a couple of reasons. 26:46.000 --> 26:50.000 I think that the first one would be aggressive behavior. 26:50.000 --> 26:54.800 So for example, AI systems are employed to filter out, you know, 26:54.800 --> 26:59.440 hateful content or to detect copyright infringement. 26:59.440 --> 27:02.560 And it's done in an automated way. 27:02.560 --> 27:05.280 And with limited human oversight. 27:05.280 --> 27:08.960 And more specifically, for example, YouTube uses AI 27:08.960 --> 27:12.400 to detect unauthorized use of copyrighted materials. 27:12.400 --> 27:14.320 But sometimes it gets things wrong. 27:14.320 --> 27:18.480 And it doesn't understand things like parodies or memes 27:18.480 --> 27:21.360 or, more generally, fair use. 27:21.360 --> 27:27.040 I think that being able to test AI and measure its fairness 27:27.040 --> 27:31.120 and be able to detect when it gets things wrong 27:31.200 --> 27:34.000 is one big challenge for fairness. 27:34.000 --> 27:37.200 One point I'm also thinking a little bit about is 27:37.200 --> 27:40.960 when people or companies say, well, we don't know 27:40.960 --> 27:44.880 why this was the result of our software. 27:44.880 --> 27:47.680 It's so complex, we cannot understand it anymore. 27:47.680 --> 27:51.600 So we're sorry about that, but it was the AI.
27:51.600 --> 27:55.520 So when people say something like that, do you think that's true? 27:55.520 --> 27:58.640 Or do you think that this is something they 27:58.640 --> 28:00.000 rather use as an apology? 28:02.720 --> 28:06.000 So I think that a decade ago it was true, 28:06.000 --> 28:10.560 because we weren't able to really understand the AI. 28:10.560 --> 28:14.720 I mean, AI can sometimes give a lot of good predictions. 28:14.720 --> 28:16.640 But we are not able to interpret them. 28:16.640 --> 28:20.000 Because the neural networks and the technologies 28:20.000 --> 28:23.120 used to make predictions are so complex 28:23.120 --> 28:27.600 that we are not able to interpret the results. 28:27.680 --> 28:33.840 In a way, we aren't able to connect the input to the output. 28:33.840 --> 28:38.400 I mean, we aren't able to know what in the input 28:38.400 --> 28:40.240 led to the prediction. 28:41.040 --> 28:43.360 But I think that we are getting better at this. 28:43.360 --> 28:49.440 And we are researching ways to interpret the results of the AI. 28:49.440 --> 28:55.280 So if companies or people want to not take responsibility for that, 28:55.280 --> 28:58.880 it's probably rather that maybe they don't know 28:58.880 --> 29:03.600 at the moment why certain decisions are happening like that. 29:03.600 --> 29:06.960 But they also maybe don't want to know at the moment. 29:06.960 --> 29:08.480 Because if they would like to know, 29:08.480 --> 29:10.320 they would have the means to find out 29:10.320 --> 29:12.640 why certain decisions are made by the AI. 29:15.600 --> 29:20.000 Yes, yes, but I think that it boils down to money. 29:20.640 --> 29:24.560 I think that being able to produce a system 29:24.560 --> 29:28.240 that is interpretable costs a lot of money. 29:28.240 --> 29:29.760 And it takes a lot of time.
29:29.760 --> 29:35.120 And so you need to be able to spend money 29:35.120 --> 29:38.240 to create powerful AI that are well designed, 29:39.120 --> 29:42.720 that are transparent, that are fair, accessible, 29:42.720 --> 29:45.360 and that you are able to interpret. 29:46.160 --> 29:51.840 So I think that one issue with this is time and money. 29:52.560 --> 29:55.280 If you now think about what we talked about 29:55.280 --> 29:59.200 and maybe also about how AI without free software 29:59.200 --> 30:01.440 could shape and control our future, 30:01.440 --> 30:05.040 are you then afraid of the increasing usage of artificial intelligence 30:05.040 --> 30:06.160 in our society? 30:07.360 --> 30:11.360 I think that there is this issue with powerful AI 30:11.360 --> 30:16.240 that are proprietary and that don't have any kind of human oversight. 30:16.960 --> 30:21.360 So there is a danger, because as I gave examples earlier, 30:21.440 --> 30:26.320 artificial intelligence has a lot of consequences in our world. 30:26.320 --> 30:31.520 And sometimes it's good, but sometimes it leads to mistakes 30:31.520 --> 30:34.480 or things that we don't want to see. 30:35.280 --> 30:39.200 And I think that it's a bit scary, to be honest, 30:39.200 --> 30:43.760 to have these systems that we aren't able to access 30:43.760 --> 30:48.240 and we aren't able to inspect because they are proprietary. 30:49.200 --> 30:52.160 And also I'm a bit scared about AI 30:52.160 --> 30:54.480 because of its impact on employment. 30:55.200 --> 30:59.200 Because a lot of jobs will be replaced with AI at some point. 30:59.760 --> 31:02.640 And I hope that we will find a way to not put people 31:02.640 --> 31:05.040 whose jobs might become irrelevant 31:05.040 --> 31:06.560 in an embarrassing situation. 31:08.400 --> 31:11.200 And how about an AI that would be free software? 31:11.200 --> 31:12.320 Would you then be afraid?
31:14.240 --> 31:17.680 A bit less, because with free software, 31:18.240 --> 31:21.600 we are able to inspect how the AI works. 31:22.240 --> 31:25.760 And so we are able to tackle a lot of issues 31:25.760 --> 31:28.240 we have with proprietary AI. 31:28.960 --> 31:32.640 And with this, we can evaluate how accurate it is, 31:32.640 --> 31:34.080 how fair it is. 31:34.080 --> 31:36.960 And I think that it should be mandatory 31:36.960 --> 31:41.680 and it's much less scary to have AI that are open and accessible. 31:41.680 --> 31:44.880 So Vincent, unfortunately, we are coming to the end. 31:44.880 --> 31:49.520 So I think this topic is a big challenge for human freedoms. 31:50.000 --> 31:53.920 And I'm not sure yet how exactly AI should look in future. 31:54.720 --> 31:57.120 I think on the way there, we will learn a lot 31:57.120 --> 32:00.080 and also have some good and some bad experiences. 32:00.080 --> 32:03.120 But in general, the idea you're promoting, 32:03.120 --> 32:05.440 supporting people building AI 32:05.440 --> 32:08.000 that is accessible, transparent and fair, 32:08.000 --> 32:10.400 seems like a good first step for humankind. 32:11.360 --> 32:14.880 Even if that process might then sometimes be slower 32:14.880 --> 32:17.040 than if you don't apply those criteria. 32:17.760 --> 32:21.600 So thank you already very much for talking with us about AI. 32:22.320 --> 32:26.000 In our podcast, we always, at the end, have one question. 32:26.720 --> 32:29.760 And I would also like to ask that to you. 32:29.760 --> 32:34.160 So as our regular visitors know, on the 14th of February, 32:34.160 --> 32:36.720 we always celebrate the I Love Free Software Day 32:37.440 --> 32:41.760 so that not just the flower industry benefits from this day. 32:41.760 --> 32:45.120 And we use this day to thank free software developers 32:45.120 --> 32:47.440 and communities out there for their effort and work 32:47.440 --> 32:49.840 in making our society a better place to live. 
32:50.480 --> 32:52.720 But of course, the 14th of February 32:52.720 --> 32:55.360 shouldn't be the only day where you thank people 32:55.360 --> 32:57.200 for their work for free software. 32:57.200 --> 32:58.400 I wanted to ask you the question, 32:58.400 --> 32:59.920 is there any software out there 32:59.920 --> 33:04.240 or any developer out there whom you would like to thank or to mention? 33:07.680 --> 33:10.320 Yes, so I'd like to mention a few pieces of software. 33:10.960 --> 33:13.680 So I want to thank Keras, 33:13.680 --> 33:16.800 the artificial intelligence framework. 33:16.800 --> 33:19.920 So it's software that is used to build 33:19.920 --> 33:21.680 artificial intelligence very easily. 33:22.240 --> 33:26.880 And I'm also very grateful to the PyTorch 33:26.880 --> 33:27.760 developers. 33:27.760 --> 33:30.320 I think it's a project from Facebook. 33:30.320 --> 33:34.800 And also for the TensorFlow software done by Google. 33:34.880 --> 33:36.960 And I'm deeply thankful for this 33:36.960 --> 33:40.160 because I'm basing my PhD project on this software. 33:40.800 --> 33:42.720 And so far, it's been working great. 33:43.520 --> 33:47.920 And I'm also really thankful to the Mozilla community 33:47.920 --> 33:50.720 for developing the Firefox web browser 33:50.720 --> 33:52.960 because it's a web browser that I really like. 33:53.520 --> 33:55.280 Because it's free software. 33:55.280 --> 33:57.520 It respects your privacy. 33:57.520 --> 33:58.720 It's powerful. 33:58.720 --> 33:59.840 It's fast. 33:59.840 --> 34:00.240 So yeah. 34:00.240 --> 34:00.880 Thank you, Vincent. 34:03.440 --> 34:03.920 You're welcome. 34:04.960 --> 34:06.080 Thank you, Vincent, 34:06.080 --> 34:08.640 for talking with us about artificial intelligence 34:08.640 --> 34:09.440 and free software. 34:10.320 --> 34:12.480 This was the Software Freedom Podcast. 
34:12.480 --> 34:15.840 If you liked this episode, please recommend it to your friends 34:15.840 --> 34:16.560 and rate it. 34:17.200 --> 34:20.320 Also subscribe to make sure you will get the next episode. 34:20.960 --> 34:24.400 This podcast is presented to you by the Free Software Foundation Europe. 34:24.400 --> 34:27.440 We are a charity that works on promoting software freedom. 34:27.440 --> 34:30.880 If you like our work, please consider supporting us with a donation. 34:30.880 --> 34:33.440 You'll find more information under fsfe.org, 34:33.520 --> 34:34.560 slash donate. 34:34.560 --> 34:35.600 Thank you very much. 34:35.600 --> 34:36.800 Thank you very much, Vincent.