
Transcript of SFP#7 Artificial intelligence as Free Software with Vincent Lequertier


This is a transcript created with the Free Software tool Whisper. For more information and feedback reach out to podcast@fsfe.org

WEBVTT

00:00.000 --> 00:18.080
Welcome to the Software Freedom Podcast.

00:18.080 --> 00:22.560
This podcast is presented to you by the Free Software Foundation Europe, which is a charity

00:22.560 --> 00:25.480
that empowers users to control technology.

00:25.480 --> 00:29.020
I'm Matthias Kirschner, the President of the Free Software Foundation Europe, and I'm

00:29.020 --> 00:31.820
doing this podcast with my colleague, Bonnie Mehring.

00:31.820 --> 00:33.060
Hello!

00:33.060 --> 00:36.980
In this episode, we will talk about artificial intelligence and free software, which

00:36.980 --> 00:41.620
for me is also a lot about the question of how power is divided between computers or machines on one side

00:41.620 --> 00:43.900
and humans on the other side.

00:43.900 --> 00:46.020
Our guest for today is Vincent Lequertier.

00:46.020 --> 00:51.860
He is an active FSFE contributor in our French team, our system hackers team, and

00:51.860 --> 00:55.140
regularly also gives talks for the FSFE.

00:55.140 --> 01:00.060
Besides that, he is a PhD student at the University Claude Bernard, researching

01:00.060 --> 01:03.740
artificial intelligence for healthcare systems.

01:03.740 --> 01:05.060
Hello Vincent.

01:05.060 --> 01:06.060
Hello Bonnie.

01:06.060 --> 01:07.060
Hello, Matthias.

01:07.060 --> 01:08.060
Hello Vincent.

01:08.060 --> 01:13.540
To just go ahead, Vincent, when we talk about artificial intelligence, I automatically

01:13.540 --> 01:18.220
think of HAL from 2001: A Space Odyssey or Samantha, the artificial intelligence

01:18.220 --> 01:20.100
from the movie Her.

01:20.100 --> 01:24.900
Is this how I should imagine what artificial intelligence looks like?

01:24.900 --> 01:31.140
You know, I never got around to seeing 2001: A Space Odyssey, but I did watch

01:31.140 --> 01:32.140
the movie Her.

01:32.140 --> 01:38.100
I found it a bit creepy, but I don't think that we are anywhere close to making voice

01:38.100 --> 01:42.180
assistants with emotions and personality.

01:42.180 --> 01:45.540
But AI is much more than interactive robots.

01:45.540 --> 01:51.220
It encompasses a lot of different techniques, aiming at simulating, and in some cases

01:51.220 --> 01:54.660
surpassing human intelligence.

01:54.660 --> 02:01.660
It also includes chat bots, voice recognition, text translation, bots in video games, and

02:01.660 --> 02:02.660
so on.

02:02.660 --> 02:08.820
A formal definition of artificial intelligence may be any system that can learn how to

02:08.820 --> 02:12.500
perform a task based on observation.

02:12.500 --> 02:19.340
If I want to cite practical examples of AI, I might mention Mycroft, the voice

02:19.340 --> 02:21.940
assistant that is free software.

02:22.500 --> 02:27.500
OK, so we do have artificial intelligence in our lives.

02:27.500 --> 02:35.980
Yes, whether we realize it or not, AI is here for us, and it's a powerful technology that

02:35.980 --> 02:43.820
has been in our lives for maybe a decade or two.

02:43.820 --> 02:49.660
Vincent, in your presentations about your work with AI, your main demand in the

02:49.660 --> 02:55.740
past was that artificial intelligence should be accessible, transparent, and fair.

02:55.740 --> 03:00.100
I think it would be very interesting for our listeners to dive more into those criteria

03:00.100 --> 03:02.700
and what you understand by them.

03:02.700 --> 03:06.580
Maybe we could start with the fairness part, Bonnie, you had some questions when we were

03:06.580 --> 03:07.580
preparing for this.

03:07.580 --> 03:09.740
Do you want to go ahead?

03:09.740 --> 03:10.740
Yes, please.

03:10.740 --> 03:15.940
Vincent, I was wondering, what does fairness mean for an AI?

03:15.940 --> 03:21.820
Would it be seen as unfair if an AI does not follow the laws of a society like the law

03:21.820 --> 03:26.540
to not discriminate against people regardless of race, sex, or gender?

03:26.540 --> 03:34.460
So yes, if I want to define fairness for artificial intelligence, fairness means the equality

03:34.460 --> 03:41.140
of treatment for everyone, regardless of the attributes that you don't want to include in

03:41.140 --> 03:43.620
your prediction models.

03:43.620 --> 03:48.180
For example, you might want to have a fair artificial intelligence that does not take into

03:48.180 --> 03:57.260
account your gender or your race or your religion or your age or any kind of sensitive attributes.

03:57.260 --> 04:02.420
Do you have an example of how an AI could discriminate someone?

04:02.420 --> 04:05.580
Yes, so I have a couple of examples.

04:05.580 --> 04:09.540
There was a case of racial bias in healthcare a couple of years ago.

04:10.100 --> 04:15.500
This has been reported in a research article whose title is Dissecting racial bias in

04:15.500 --> 04:19.460
an algorithm used to manage the health of populations.

04:19.460 --> 04:25.500
And in this article, the authors found that a widely used algorithm for assessing the risk

04:25.500 --> 04:32.260
of health issues of people had racial bias.

04:32.260 --> 04:38.180
And this algorithm is used to identify high-risk patients, who get more care resources

04:38.180 --> 04:41.660
and attention from the hospital staff.

04:41.660 --> 04:46.780
But unfortunately, the issue with this algorithm is that to get the same risk score as white

04:46.780 --> 04:51.180
people, black people had to be much sicker.

04:51.180 --> 04:57.260
And this is presumably caused by basing the risk estimation not only on the health of the people,

04:57.260 --> 05:01.060
but also on the estimated healthcare cost.

05:01.060 --> 05:06.940
So as you can see, AI bias can have important real-world consequences.

05:06.940 --> 05:12.980
And I can give you another example this time in the US justice system.

05:12.980 --> 05:19.020
There is a proprietary software called COMPAS, used to tell how likely someone is going

05:19.020 --> 05:21.620
to commit a crime again.

05:21.620 --> 05:27.180
An analysis by ProPublica revealed that the algorithm was racially biased.

05:27.180 --> 05:33.860
It turned out that compared to white people, black people had a much higher risk of being

05:33.860 --> 05:41.340
falsely considered as risky criminals that are going to commit their crimes again.

05:41.340 --> 05:47.900
So in other words, the algorithm said that black people were much more dangerous to society

05:47.900 --> 05:49.940
than white people.

05:49.940 --> 05:56.580
And conversely, white people were often misclassified as low-risk offenders, which means unlikely

05:56.580 --> 05:59.060
to repeat their crimes.

05:59.060 --> 06:05.340
So the false positive rate was much higher for black people compared to white people and

06:05.340 --> 06:07.900
the reverse for the false negative rate.

06:07.900 --> 06:13.620
Again, this shows that unfair algorithms exist in the wild and that they are used in critical

06:13.620 --> 06:15.020
cases.

06:15.020 --> 06:18.500
And on top of that, those two algorithms aren't free software.

06:18.500 --> 06:24.740
OK, before I go over to my next question, could you briefly describe what false positive

06:24.740 --> 06:27.100
and false negative mean?

06:27.660 --> 06:34.220
Yes, so to explain false positives and false negatives, and true positives and true negatives,

06:34.220 --> 06:38.980
I will give you an example based on spam detection.

06:38.980 --> 06:45.540
So spam are emails you don't want to see, and to tackle spam,

06:45.540 --> 06:51.620
there is software that is used to classify whether an email is spam or a legitimate

06:51.620 --> 06:54.060
email.

06:54.060 --> 07:00.500
So if you get a message and it's completely legitimate email, but the software classifies

07:00.500 --> 07:06.860
it as spam, it will be called a false positive because the software thought that the email

07:06.860 --> 07:09.540
was a spam, but it wasn't.

07:09.540 --> 07:17.940
If the email was in fact a spam, but the software thought it was completely legitimate, it will

07:17.940 --> 07:24.900
be called a false negative because the software thought that the email wasn't a spam.

07:24.900 --> 07:30.300
And so the true positive and the true negatives are correct classifications, meaning that the

07:30.300 --> 07:36.420
software correctly classified the emails as spam or legitimate email.

07:36.420 --> 07:43.580
So this is an example that can be used to explain this concept.
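
As a minimal sketch of these four outcomes, here is how they could be counted in Python; the emails, labels, and predictions below are hypothetical examples:

    # True class of each email: True = spam, False = legitimate.
    actual    = [True, False, True, False, False, True]
    # What a hypothetical spam filter predicted for each email.
    predicted = [True, True, False, False, False, True]

    true_positives  = sum(a and p for a, p in zip(actual, predicted))          # spam caught
    false_positives = sum(p and not a for a, p in zip(actual, predicted))      # legitimate mail flagged as spam
    false_negatives = sum(a and not p for a, p in zip(actual, predicted))      # spam that slipped through
    true_negatives  = sum(not a and not p for a, p in zip(actual, predicted))  # legitimate mail let through

    print(true_positives, false_positives, false_negatives, true_negatives)    # prints: 2 1 1 2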

07:43.580 --> 07:46.700
Vincent, I also have a question about what you just said.

07:46.700 --> 07:51.580
So I mean you said that sometimes there are mistakes that happen.

07:51.580 --> 07:57.900
But I mean, when we look back in the history of humankind, there were a lot of occasions

07:57.900 --> 08:02.580
when humans discriminated against certain groups on purpose.

08:02.580 --> 08:08.420
And a lot of that was also done on purpose with architecture, with technical means, like

08:08.420 --> 08:13.060
for example in Lawrence Lessig's book, with whom we also talked about regulation before

08:13.060 --> 08:20.500
in one of the podcasts, there's an example of how bridges and train lines were used to

08:20.500 --> 08:27.180
make it harder for certain minorities to go to other parts of a city and get better

08:27.180 --> 08:28.380
jobs.

08:28.380 --> 08:34.940
How can we find out if something is done by mistake or if that's on purpose when you

08:34.940 --> 08:38.500
have an AI involved?

08:38.500 --> 08:45.620
So if you have only the end result of the AI, I mean if you have only the predictions,

08:45.620 --> 08:51.420
then you cannot really know the intent, like the purpose of the predictions.

08:51.420 --> 08:53.140
What you need is the source code.

08:53.140 --> 09:00.700
So free software will help you to know the purpose behind the predictions because you know

09:00.700 --> 09:09.900
what the input of the AI was, and you can also know what the design behind the prediction

09:09.900 --> 09:10.900
model was.

09:10.900 --> 09:17.060
So you can guess how the data was processed and how the algorithm was used and that way

09:17.060 --> 09:24.260
you know the purpose of the AI and also you can know how the model was evaluated.

09:24.260 --> 09:31.020
I mean what metric was used to evaluate the performance of the artificial intelligence.

09:31.020 --> 09:38.100
So you can know for example if the true positive rate was the same whether the person was

09:38.100 --> 09:46.420
a male or a female or black or white or whatever and if you can do this kind of test then

09:46.420 --> 09:50.940
you can see if the AI was fair.

09:50.940 --> 09:58.820
So by seeing the source code, and with transparency, you can then guess the purpose.
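
As a minimal sketch of the test described above, assuming a made-up set of outcomes, predictions, and a protected attribute, the true positive rate can be compared across groups like this:

    # Hypothetical records: (actual outcome, model prediction, group).
    records = [
        (1, 1, "A"), (1, 0, "A"), (1, 1, "A"), (0, 0, "A"),
        (1, 1, "B"), (1, 0, "B"), (1, 0, "B"), (0, 1, "B"),
    ]

    def true_positive_rate(pairs):
        # Fraction of actual positives that the model predicted correctly.
        positives = [prediction for actual, prediction in pairs if actual == 1]
        return sum(positives) / len(positives)

    for group in ("A", "B"):
        pairs = [(a, p) for a, p, g in records if g == group]
        print(group, true_positive_rate(pairs))
    # Prints about 0.67 for group A and 0.33 for group B; a gap like
    # this would be a sign of an unfair model.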

09:58.820 --> 10:05.780
So this part is now about the demand for transparency you talk about in your presentations, right?

10:05.780 --> 10:13.620
Yeah, I think that there is a real connection between how transparent an algorithm is and

10:13.620 --> 10:22.180
how fair it can be, because much like when we talk about security and free software, I think

10:22.180 --> 10:28.660
that we need transparency for algorithms to ensure that they are fair.

10:28.660 --> 10:34.100
If you cannot see the source code of an AI and if it's not transparent, then you cannot

10:34.100 --> 10:36.300
ensure that it will be fair.

10:36.300 --> 10:41.780
Much like you cannot really be sure about the security of the software if you cannot

10:41.780 --> 10:44.340
see the source code of it.

10:44.340 --> 10:49.580
Is it the case that if you have the source code of the AI that this would be sufficient

10:49.580 --> 10:55.420
to understand how it's actually working or do you also need a lot of the training data

10:55.420 --> 11:00.100
or other data that the AI learned from?

11:00.100 --> 11:06.140
I think that to answer your question to really understand the AI you will need three things

11:06.140 --> 11:07.140
basically.

11:07.140 --> 11:13.900
You will need the data that was used to train the AI, or, if the data is really sensitive

11:13.900 --> 11:20.060
and you cannot access it, you can have its characteristics.

11:20.060 --> 11:27.740
So what were the variables, and what were their distributions? What did they look like?

11:27.740 --> 11:31.860
Then you need to know how the AI was trained.

11:31.860 --> 11:35.860
So what was the source code used to train the AI?

11:35.860 --> 11:40.700
And then you need to be able to evaluate the AI.

11:40.700 --> 11:49.220
You need to have some kind of metric that tells if the AI was accurate and if the accuracy

11:49.220 --> 11:56.820
was the same regardless of some kind of attribute such as your age or gender or any kind

11:56.820 --> 11:59.860
of protected attribute.
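
As a minimal sketch of the first point, assuming a hypothetical CSV file of training records and using the pandas library, the characteristics of sensitive data could be published without releasing the raw records:

    import pandas as pd

    # Hypothetical training data; the raw records themselves stay private.
    data = pd.read_csv("training_data.csv")

    print(data.dtypes)                                  # which variables exist
    print(data.describe(include="all"))                 # their distributions
    print(data["gender"].value_counts(normalize=True))  # balance of a protected attribute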

11:59.860 --> 12:07.540
From my understanding one thing that AI is able to do is to very quickly adapt and learn

12:07.540 --> 12:10.060
way, way faster than humans are.

12:10.060 --> 12:15.940
So when we are now talking about source code, is it correct that this means that at one

12:15.940 --> 12:22.660
time, like one minute, it is one source code, and a few minutes later it's completely different,

12:22.660 --> 12:28.580
and the AI might act on different rules? Or how should I imagine that?

12:28.580 --> 12:36.140
So I don't really think that AI learns faster than humans do, because I mean, if you

12:36.140 --> 12:43.460
show 10 pictures of cats to a two-year-old, he or she will be able to, you know, recognize

12:43.460 --> 12:50.860
cats or any kind of animal, but for an AI you need to put through the algorithm like millions

12:50.860 --> 12:55.980
or billions of images for it to know any kind of subject.

12:55.980 --> 13:01.180
So I don't think this is generally true that the algorithm is faster.

13:01.180 --> 13:05.940
It just appears to be because we have a lot of computational power so we can use a lot

13:05.940 --> 13:12.300
of computation to train algorithms for days and days and days in data centers.

13:12.300 --> 13:18.620
So for an AI to work you have to train it with the right data, with the right training

13:18.620 --> 13:25.220
code, and evaluate its performance in a way that measures how fair it is, and afterwards

13:25.260 --> 13:30.100
you have to monitor its accuracy over time.

13:30.100 --> 13:38.780
You have to check if the fairness of the algorithm stays the same, and if the AI's fairness drops,

13:38.780 --> 13:46.420
you have to detect it and stop using it, and then retrain your AI with

13:46.420 --> 13:53.380
new data or with new source code to make sure that the fairness is good.

13:53.380 --> 13:59.580
So yeah, the source code of the AI can change, and it has to be checked.

13:59.580 --> 14:07.220
So that means that the AI itself would also have to be set up in a way that it's documenting

14:07.220 --> 14:11.540
itself in a way that humans can understand.

14:11.540 --> 14:14.340
Do I understand that correctly, Vincent?

14:14.340 --> 14:20.820
Yes, so what you need is to make sure that the AI can give you some kind of metric of

14:20.820 --> 14:29.540
fairness regularly, like each day you measure the fairness score, so that you can have some

14:29.540 --> 14:37.220
kind of measure and you can detect anomalies in the fairness.
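
A minimal sketch of such a daily check could look like this; fairness_score(), load_todays_data(), and alert() are hypothetical helpers, since the actual metric depends on the fairness definition chosen with the stakeholders:

    FAIRNESS_THRESHOLD = 0.85  # assumed acceptable level, agreed with stakeholders
    history = []               # daily record of fairness scores

    def daily_fairness_check(model):
        # fairness_score() and load_todays_data() stand in for whatever
        # metric and data pipeline the project actually uses.
        score = fairness_score(model, load_todays_data())
        history.append(score)  # keep the series to spot drops over time
        if score < FAIRNESS_THRESHOLD:
            alert("Fairness dropped: retrain the AI with new data or new code.")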

14:37.220 --> 14:42.340
I do have a basic question here because you have already mentioned the training data

14:42.340 --> 14:49.060
for an artificial intelligence: who actually trains an artificial intelligence, and how should

14:49.060 --> 14:50.900
I imagine what the data looks like?

14:50.900 --> 14:58.140
For example, if you take Alexa, a voice assistant like the example you gave at the beginning,

14:58.140 --> 15:04.500
who trains Alexa? Would this be Amazon, or is it a person at home?

15:04.500 --> 15:10.100
So an AI is trained both with data and source code.

15:10.100 --> 15:16.900
Basically, when you are using Alexa or any kind of voice recording device, you create

15:16.900 --> 15:20.980
data that is used to train Alexa again.

15:20.980 --> 15:27.660
So you participate in the training of the AI because your data is used, but the training

15:27.660 --> 15:34.180
code is done by Amazon, and because Alexa is proprietary, we can really only guess what

15:34.180 --> 15:41.600
is happening there, but I guess that data scientists there train the AI, often with

15:41.600 --> 15:42.600
research tools.

15:42.600 --> 15:54.120
AI is developed a lot with open source software, and it's done inside companies by data scientists.

15:54.120 --> 15:59.120
I can imagine when you have lots of data and you have to train such an AI that also means

15:59.120 --> 16:05.560
that you need a lot of processing power for the AI you deal with. Is it something that

16:05.560 --> 16:10.640
you can actually run on your computer, or do people have to imagine it more like you

16:10.640 --> 16:17.240
need huge data centers to train an AI or how do you have to think about that?

16:17.240 --> 16:21.960
So it depends on what you want and also it depends on the AI itself.

16:21.960 --> 16:28.960
If you want to, like, reproduce the state of the art, I mean the paper that was just published

16:28.960 --> 16:34.840
last month, and reproduce all their results, well, you cannot if you don't have giant

16:34.840 --> 16:42.040
data centers with entire teams that monitor the computers and stuff, so you need a lot

16:42.040 --> 16:46.560
of money and computing power to do that.

16:46.560 --> 16:52.160
Because you need to train your AI for a lot of time and with a lot of data, by a lot

16:52.160 --> 16:57.920
of data I mean like gigabytes or terabytes of data.

16:57.920 --> 17:04.680
But thankfully, with your home computer, I mean with your laptop, you can

17:04.680 --> 17:11.160
still get good results if you have more modest intentions.

17:11.160 --> 17:18.640
Because of improvements in the hardware, like with GPUs, I mean graphical processing

17:18.640 --> 17:25.560
units, getting cheaper and cheaper, you can have powerful machines at home and you can

17:25.560 --> 17:28.480
use them to train some AI.

17:28.480 --> 17:31.360
And that's also possible because of free software.

17:31.360 --> 17:37.560
Because free software is available to you, you can use it yourself, and so you

17:37.560 --> 17:43.680
can train it on your personal computer and it will work. It will work also because you

17:43.680 --> 17:47.840
can leverage already trained models.

17:47.840 --> 17:55.600
What you can do is to take the already existing models and incorporate them inside your

17:55.600 --> 17:56.600
own model.

17:56.600 --> 18:04.760
You can take the state-of-the-art model and just train some part of it to repurpose it for your

18:04.760 --> 18:05.760
needs.

18:05.760 --> 18:14.000
So I think that it's a very powerful technique, and it enables you to use AI with

18:14.000 --> 18:19.840
simple, I mean basic, computers and still have amazing results.

18:19.840 --> 18:24.640
So you kind of use pre-trained AI and continue with this.

18:24.640 --> 18:33.360
Yes, you can use pre-trained AI for a lot of common tasks such as image classification

18:33.360 --> 18:42.000
or, for example, an NLP model, like natural language processing models that have gathered

18:42.000 --> 18:49.040
a lot of knowledge about language and you can take these giant big models and you can

18:49.040 --> 18:53.720
use them as part of your own model.

18:53.720 --> 18:59.880
For example, there is a very large AI competition that is called ImageNet.

18:59.880 --> 19:09.080
In this competition, you have to classify I think 10,000 different categories of dogs

19:09.080 --> 19:12.920
or animals or objects or things.

19:12.920 --> 19:17.080
So you have 10,000 different things to classify.

19:17.080 --> 19:22.960
And this is a competition done by researchers or scientists.

19:22.960 --> 19:30.060
And so the winner, so the model that is the most accurate at doing that is often released

19:30.060 --> 19:32.560
publicly as free software.

19:32.560 --> 19:38.680
So what you can do, if, let's say, you want to classify between two different

19:38.680 --> 19:41.960
things, like cats and dogs.

19:41.960 --> 19:45.800
For example, you have images of cats and images of dogs.

19:45.800 --> 19:52.520
What you can do, instead of starting from scratch, is to take these big models and repurpose

19:52.520 --> 19:54.640
them for your needs.

19:54.640 --> 20:03.360
So you can train only part of it and reduce the 10,000 classification labels to only cats

20:03.360 --> 20:04.960
and dogs.

20:04.960 --> 20:11.320
And that will be much faster and more efficient than starting from scratch.
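
As a minimal sketch of that repurposing, using PyTorch and torchvision (free-software frameworks mentioned later in this episode), a model pre-trained on ImageNet can have its classification head swapped for a two-class one:

    import torch
    import torch.nn as nn
    from torchvision import models

    # Load a network whose weights were already trained on ImageNet.
    model = models.resnet18(pretrained=True)

    # Freeze the pre-trained layers; only the new head will be trained.
    for param in model.parameters():
        param.requires_grad = False

    # Replace the ImageNet classification head with a two-class one (cats vs. dogs).
    model.fc = nn.Linear(model.fc.in_features, 2)

    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    # ...then train as usual on a much smaller cat-and-dog dataset.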

20:11.320 --> 20:15.040
So I think we are now already partly in the accessibility part.

20:15.040 --> 20:19.200
So I mean, we talked about the fairness, we talked about transparency.

20:19.200 --> 20:24.880
Now with accessibility, I imagine, I mean, one part is that the tools are free software,

20:24.880 --> 20:30.000
so you can use them for any purpose, that you can understand how they work, that you can

20:30.000 --> 20:33.720
share them with others and that you can make modifications.

20:33.720 --> 20:38.400
Is there anything else which is necessary for AIs to be accessible?

20:39.040 --> 20:39.840
Yes.

20:39.840 --> 20:48.280
So what you need is powerful hardware, but thankfully, as I said, powerful hardware is getting

20:48.280 --> 20:50.160
cheaper every day.

20:50.160 --> 20:57.480
So you can have accessible hardware that you can use to like train your own artificial

20:57.480 --> 20:59.760
intelligence.

20:59.760 --> 21:07.960
But unfortunately, the drivers for these graphical processing units are proprietary.

21:07.960 --> 21:14.040
By that, I mean that the software that is used to make your card communicate with your

21:14.040 --> 21:22.880
computer is proprietary, and that prevents AI from being fully accessible, unfortunately.

21:22.880 --> 21:31.440
So it makes AI training with free software much more complicated than it should be.

21:31.440 --> 21:35.520
So maybe, Vincent, to summarize it a bit up to this point.

21:35.520 --> 21:41.600
So for fairness, what do we need to have a fair AI?

21:41.600 --> 21:46.640
So you need to be able to measure the fairness of the AI.

21:46.640 --> 21:52.320
You need to evaluate how fair it is with some kind of score.

21:52.320 --> 21:58.840
And then you need to be able to monitor this score to make sure that it stays the same.

21:58.840 --> 22:05.280
And then you need to make sure that this score has been well established because I mean,

22:05.280 --> 22:08.360
there are multiple definitions of fairness.

22:08.360 --> 22:11.400
And so you can measure it in different ways.

22:11.400 --> 22:17.720
So you have to agree with all stakeholders to make sure that your fairness definition

22:17.720 --> 22:21.680
is good considering the problem at hand.

22:21.680 --> 22:28.880
And then you need to, as I said, monitor the fairness of the software.

22:28.880 --> 22:33.840
Could you also summarize transparency and accessibility for us?

22:33.840 --> 22:34.840
Yes.

22:35.000 --> 22:43.600
Transparency of AI means having access to the data that was used to train the algorithm,

22:43.600 --> 22:49.760
or at least being able to know the characteristics of the input data.

22:49.760 --> 22:54.320
Then you need to have access to the source code of the AI.

22:54.320 --> 23:01.600
And then you need to define a metric that is used to tell if the model is accurate.

23:01.600 --> 23:08.360
And also if it's accurate for every value of a protected attribute.

23:08.360 --> 23:15.080
And then you need to make sure that everything is released as free software.

23:15.080 --> 23:22.720
And also what is great with regard to transparency is that recently,

23:22.720 --> 23:28.960
with the Free Software Foundation Europe, what we want to do is to have open science.

23:28.960 --> 23:35.320
So open science means to have science accessible to all and to consider software

23:35.320 --> 23:38.480
as a result of the research.

23:38.480 --> 23:44.440
As a citizen, you should be able to have access to the data that was used for the research

23:44.440 --> 23:46.600
and also to its source code.

23:46.600 --> 23:51.560
And all of that was used to create an AI.

23:51.560 --> 23:59.800
And so with these two things, you are able to have access to the artificial intelligence

23:59.800 --> 24:02.040
and to make it transparent.

24:02.040 --> 24:11.720
So to summarize the accessibility point, what you need is to be able to train the AI yourself.

24:11.720 --> 24:15.320
We need to have free software to train AI.

24:15.320 --> 24:22.040
So we need to have free frameworks and methods to train artificial intelligence.

24:22.040 --> 24:28.400
We need also to have cheap and reliable hardware to train artificial intelligence.

24:28.400 --> 24:36.000
And you need to have free drivers to be able to control these GPUs.

24:36.000 --> 24:39.560
Is there any AI out there which implements those three criteria?

24:39.560 --> 24:43.800
So do we have any positive examples there?

24:43.840 --> 24:51.880
So yeah, unfortunately, I don't know any kind of AI that is fair, accessible,

24:51.880 --> 24:54.280
and transparent at the same time.

24:54.280 --> 25:01.480
And I think it's really bad and we can do much better with regard to these three things.

25:01.480 --> 25:04.760
So yeah, no AI is perfect yet.

25:04.760 --> 25:09.640
Do you know of any upcoming legislations in Europe that are planning on implementing

25:09.640 --> 25:12.640
those three criteria for an AI?

25:12.640 --> 25:14.240
No, unfortunately not.

25:14.240 --> 25:21.600
I'm not aware of any kind of legislation that is ongoing, but fortunately there is hope,

25:21.600 --> 25:26.480
because the European Commission released a white paper in February.

25:26.480 --> 25:32.640
Its title is On Artificial Intelligence: A European approach to excellence and trust,

25:32.640 --> 25:35.760
which talks about AI transparency.

25:35.760 --> 25:40.320
And it demands that information about the data used to train models

25:40.320 --> 25:45.760
and how their accuracy is measured is provided to everyone.

25:45.760 --> 25:51.840
So this is not legislation, but I think it's a step in the right direction.

25:51.840 --> 25:52.880
So there's hope.

25:52.880 --> 25:56.800
Vincent, to wrap it up, what are the biggest challenges you see for free software

25:56.800 --> 26:02.080
in the field of artificial intelligence at the moment?

26:02.080 --> 26:06.240
So I think that artificial intelligence is really powerful.

26:06.240 --> 26:09.120
I mean, we have made a lot of progress.

26:09.120 --> 26:13.760
And it's like in some regard, AI is much better than humans.

26:13.760 --> 26:19.200
Like it can run for hours without any kind of concentration issues.

26:19.200 --> 26:24.160
I mean, it never gets bored and it has a consistent behavior.

26:24.160 --> 26:28.640
And you know, it can remember a lot of information.

26:28.640 --> 26:34.400
So I think that for these points, AI has a lot of advantages

26:34.400 --> 26:37.120
over humans.

26:37.120 --> 26:41.360
But I think that, yeah, AI can be leveraged to improve society.

26:41.360 --> 26:46.000
But I'm afraid of AI for a couple of reasons.

26:46.000 --> 26:50.000
I think that the first one would be aggressive behavior.

26:50.000 --> 26:54.800
So for example, AI systems are employed to filter out, you know,

26:54.800 --> 26:59.440
harmful content or to detect copyright infringement.

26:59.440 --> 27:02.560
And it's done in an automated way.

27:02.560 --> 27:05.280
And with limited human oversight.

27:05.280 --> 27:08.960
And more specifically, for example, YouTube use AI

27:08.960 --> 27:12.400
to detect unauthorized use of copyrighted materials.

27:12.400 --> 27:14.320
But sometimes it gets things wrong.

27:14.320 --> 27:18.480
And it doesn't understand things like parodies or memes,

27:18.480 --> 27:21.360
or more generally fair use.

27:21.360 --> 27:27.040
I think that being able to test AI and measure its fairness,

27:27.040 --> 27:31.120
and be able to detect when it gets things wrong

27:31.200 --> 27:34.000
is one big challenge for fairness.

27:34.000 --> 27:37.200
One point I'm also thinking a little bit about is

27:37.200 --> 27:40.960
when people or companies say, well, we don't know

27:40.960 --> 27:44.880
why this was the result of our software.

27:44.880 --> 27:47.680
It's so complex, we cannot understand it anymore.

27:47.680 --> 27:51.600
So we're sorry about that, but it was the AI.

27:51.600 --> 27:55.520
So when people say something like that, do you think that's true?

27:55.520 --> 27:58.640
Or do you think that this is something they

27:58.640 --> 28:00.000
rather use as an excuse?

28:02.720 --> 28:06.000
So I think that a decade ago it was true,

28:06.000 --> 28:10.560
because we weren't able to really understand the AI.

28:10.560 --> 28:14.720
I mean AI can sometimes give a lot of good predictions.

28:14.720 --> 28:16.640
But we are not able to interpret it.

28:16.640 --> 28:20.000
Because the neural networks and the technologies

28:20.000 --> 28:23.120
used to make predictions are so complex

28:23.120 --> 28:27.600
that we are not able to interpret the results.

28:27.680 --> 28:33.840
In the sense that we aren't able to connect the input to the output.

28:33.840 --> 28:38.400
I mean, we aren't able to know what in the input

28:38.400 --> 28:40.240
led to the prediction.

28:41.040 --> 28:43.360
But I think that we are getting better at this.

28:43.360 --> 28:49.440
And we are researching ways to interpret the results of the AI.

28:49.440 --> 28:55.280
So if companies or people want to not take responsibility for that,

28:55.280 --> 28:58.880
it's probably rather that maybe they don't know

28:58.880 --> 29:03.600
at the moment why certain decisions are happening like that.

29:03.600 --> 29:06.960
But they also maybe don't want to know at the moment.

29:06.960 --> 29:08.480
Because if they would like to know,

29:08.480 --> 29:10.320
they would have the means to find out

29:10.320 --> 29:12.640
why certain decisions are made by the AI.

29:15.600 --> 29:20.000
Yes, yes, but I think that it boils down to money.

29:20.640 --> 29:24.560
I think that being able to produce a system

29:24.560 --> 29:28.240
that is interpretable costs a lot of money.

29:28.240 --> 29:29.760
And it takes a lot of time.

29:29.760 --> 29:35.120
And so you need to be able to spend money

29:35.120 --> 29:38.240
to create powerful AI that are well designed,

29:39.120 --> 29:42.720
that are transparent, that are fair, accessible,

29:42.720 --> 29:45.360
and that you are able to interpret.

29:46.160 --> 29:51.840
So I think that one issue with this is time and money.

29:52.560 --> 29:55.280
If you now think about what we talked

29:55.280 --> 29:59.200
and maybe also about how AI without free software

29:59.200 --> 30:01.440
could shape and control our future,

30:01.440 --> 30:05.040
are you then afraid of the increasing usage of artificial intelligence

30:05.040 --> 30:06.160
in our society?

30:07.360 --> 30:11.360
I think that there is this issue with powerful AIs

30:11.360 --> 30:16.240
that are proprietary and that don't have any kind of human oversight.

30:16.960 --> 30:21.360
So there is a danger, because as I gave examples earlier,

30:21.440 --> 30:26.320
artificial intelligence has a lot of consequences in our world.

30:26.320 --> 30:31.520
And sometimes it's good, but sometimes it leads to mistakes

30:31.520 --> 30:34.480
or things that we don't want to see.

30:35.280 --> 30:39.200
And I think that it's a bit scary, to be honest,

30:39.200 --> 30:43.760
to have these systems that we aren't able to access

30:43.760 --> 30:48.240
and we aren't able to inspect, because they are proprietary.

30:49.200 --> 30:52.160
And also I'm a bit scared about AI

30:52.160 --> 30:54.480
because of its impact on employment.

30:55.200 --> 30:59.200
Because a lot of jobs will be replaced with AI at some point.

30:59.760 --> 31:02.640
And I hope that we will find a way to not put people

31:02.640 --> 31:05.040
whose jobs might become irrelevant

31:05.040 --> 31:06.560
in an embarrassing situation.

31:08.400 --> 31:11.200
And how about an AI that would be free software?

31:11.200 --> 31:12.320
Would you then be afraid?

31:14.240 --> 31:17.680
A bit less, because with free software,

31:18.240 --> 31:21.600
we are able to inspect how the AI works.

31:22.240 --> 31:25.760
And so we are able to tackle a lot of issues

31:25.760 --> 31:28.240
that we have with a proprietary AI.

31:28.960 --> 31:32.640
And with this, we can evaluate how accurate it is,

31:32.640 --> 31:34.080
how fair it is.

31:34.080 --> 31:36.960
And I think that it should be mandatory

31:36.960 --> 31:41.680
and it's much less scary to have AIs that are open and accessible.

31:41.680 --> 31:44.880
So Vincent, unfortunately, we are coming to the end.

31:44.880 --> 31:49.520
So I think this topic is a big challenge for human freedoms.

31:50.000 --> 31:53.920
And I'm not sure yet how exactly AI should look in the future.

31:54.720 --> 31:57.120
I think on the way there, we will learn a lot

31:57.120 --> 32:00.080
and also have some good and some bad experiences.

32:00.080 --> 32:03.120
But in general, the idea you're promoting

32:03.120 --> 32:05.440
that supporting people building AI

32:05.440 --> 32:08.000
that is accessible, transparent and fair

32:08.000 --> 32:10.400
seems like a good first step for humankind.

32:11.360 --> 32:14.880
Even if that process might then sometimes be slower

32:14.880 --> 32:17.040
than if you don't apply those criteria.

32:17.760 --> 32:21.600
So thank you already very much for talking with us about AI.

32:22.320 --> 32:26.000
In our podcast, we always, at the end, have one question.

32:26.720 --> 32:29.760
And I would also like to ask that to you.

32:29.760 --> 32:34.160
So as our regular listeners know, on the 14th of February,

32:34.160 --> 32:36.720
we always celebrate the "I love Free Software" day,

32:37.440 --> 32:41.760
so that not just the flower industry benefits from this day.

32:41.760 --> 32:45.120
And we use this day to thank free software developers

32:45.120 --> 32:47.440
and communities out there for the effort and work

32:47.440 --> 32:49.840
to make our society a better place to live.

32:50.480 --> 32:52.720
But of course, the 14th of February

32:52.720 --> 32:55.360
shouldn't be the only day where you thank people

32:55.360 --> 32:57.200
for their work for free software.

32:57.200 --> 32:58.400
I wanted to ask you the question,

32:58.400 --> 32:59.920
is there any software out there

32:59.920 --> 33:04.240
or any developer out there whom you would like to thank or to mention?

33:07.680 --> 33:10.320
Yes, so I'd like to mention a few pieces of software.

33:10.960 --> 33:13.680
So I want to thank Keras,

33:13.680 --> 33:16.800
the artificial intelligence framework.

33:16.800 --> 33:19.920
So it's a software that is used to build

33:19.920 --> 33:21.680
Artificial Intelligence very easily.

33:22.240 --> 33:26.880
And I'm also very grateful for the PyTorch

33:26.880 --> 33:27.760
developers.

33:27.760 --> 33:30.320
I think it's a project mainly from Facebook.

33:30.320 --> 33:34.800
And also to the TensorFlow software done by Google.

33:34.880 --> 33:36.960
And I'm deeply thankful for this

33:36.960 --> 33:40.160
because I base my PhD project on those software.

33:40.800 --> 33:42.720
And so far, it's been working great.

33:43.520 --> 33:47.920
And I'm also really thankful for the Mozilla community

33:47.920 --> 33:50.720
for developing the Firefox web browser

33:50.720 --> 33:52.960
because it's a web browser that I really like.

33:53.520 --> 33:55.280
Because it's free software.

33:55.280 --> 33:57.520
It respects your privacy.

33:57.520 --> 33:58.720
It's powerful.

33:58.720 --> 33:59.840
It's fast.

33:59.840 --> 34:00.240
So yeah.

34:00.240 --> 34:00.880
Thank you, Vincent.

34:03.440 --> 34:03.920
You're welcome.

34:04.960 --> 34:06.080
Thank you, Vincent,

34:06.080 --> 34:08.640
for talking with us about Artificial Intelligence

34:08.640 --> 34:09.440
and Free Software.

34:10.320 --> 34:12.480
This was the Software Freedom Podcast.

34:12.480 --> 34:15.840
If you liked this episode, please recommend it to your friends

34:15.840 --> 34:16.560
and rate it.

34:17.200 --> 34:20.320
Also subscribe to make sure you will get the next episode.

34:20.960 --> 34:24.400
This podcast is presented to you by the Free Software Foundation Europe,

34:24.400 --> 34:27.440
which is a charity that works on promoting software freedom.

34:27.440 --> 34:30.880
If you like our work, please consider supporting us with a donation.

34:30.880 --> 34:33.440
You'll find more information under fsfe.org

34:33.520 --> 34:34.560
slash donate.

34:34.560 --> 34:35.600
Thank you very much.

34:35.600 --> 34:36.800
Thank you very much, Vincent.
