Tables

Here is the output of our final script:

Table 1
Source file: ./URLS/anglais.txt

| No. | URL | Downloaded page | Initial DUMP (non-UTF-8) | UTF-8 DUMP | CONTEXTS | HTML CONTEXTS | Occurrences |
|---|---|---|---|---|---|---|---|
| 1 | URL 1 | Downloaded page 1 | utf-8 | 1-utf8.txt | 1.txt | 1-utf8.html | 436 |
| 2 | URL 2 | Downloaded page 2 | utf-8 | 2-utf8.txt | 2.txt | 2-utf8.html | 162 |
| 3 | URL 3 | Downloaded page 3 | utf-8 | 3-utf8.txt | 3.txt | 3-utf8.html | 28 |
| 4 | URL 4 | Downloaded page 4 | | | | | |
| 5 | URL 5 | Downloaded page 5 | us-ascii | 5-utf8.txt | 5.txt | 5-utf8.html | 3 |
| 6 | URL 6 | Downloaded page 6 | utf-8 | 6-utf8.txt | 6.txt | 6-utf8.html | 106 |
| 7 | URL 7 | Downloaded page 7 | utf-8 | 7-utf8.txt | 7.txt | 7-utf8.html | 0 |
| 8 | URL 8 | Downloaded page 8 | utf-8 | 8-utf8.txt | 8.txt | 8-utf8.html | 56 |
| 9 | URL 9 | Downloaded page 9 | utf-8 | 9-utf8.txt | 9.txt | 9-utf8.html | 17 |
| 10 | URL 10 | Downloaded page 10 | utf-8 | 10-utf8.txt | 10.txt | 10-utf8.html | 75 |
| 11 | URL 11 | Downloaded page 11 | utf-8 | 11-utf8.txt | 11.txt | 11-utf8.html | 11 |
| 12 | URL 12 | Downloaded page 12 | utf-8 | 12-utf8.txt | 12.txt | 12-utf8.html | 42 |
| 13 | URL 13 | Downloaded page 13 | utf-8 | 13-utf8.txt | 13.txt | 13-utf8.html | 6 |
| 14 | URL 14 | Downloaded page 14 | utf-8 | 14-utf8.txt | 14.txt | 14-utf8.html | 60 |
| 15 | URL 15 | Downloaded page 15 | utf-8 | 15-utf8.txt | 15.txt | 15-utf8.html | 20 |
| 16 | URL 16 | Downloaded page 16 | utf-8 | 16-utf8.txt | 16.txt | 16-utf8.html | 37 |
| 17 | URL 17 | Downloaded page 17 | us-ascii | 17-utf8.txt | 17.txt | 17-utf8.html | 50 |
| 18 | URL 18 | Downloaded page 18 | utf-8 | 18-utf8.txt | 18.txt | 18-utf8.html | 28 |
| 19 | URL 19 | Downloaded page 19 | utf-8 | 19-utf8.txt | 19.txt | 19-utf8.html | 31 |
| 20 | URL 20 | Downloaded page 20 | utf-8 | 20-utf8.txt | 20.txt | 20-utf8.html | 29 |
| 21 | URL 21 | Downloaded page 21 | utf-8 | 21-utf8.txt | 21.txt | 21-utf8.html | 41 |
| 22 | URL 22 | Downloaded page 22 | utf-8 | 22-utf8.txt | 22.txt | 22-utf8.html | 6 |
| 23 | URL 23 | Downloaded page 23 | us-ascii | 23-utf8.txt | 23.txt | 23-utf8.html | 27 |
| 24 | URL 24 | Downloaded page 24 | utf-8 | 24-utf8.txt | 24.txt | 24-utf8.html | 5 |
| 25 | URL 25 | Downloaded page 25 | utf-8 | 25-utf8.txt | 25.txt | 25-utf8.html | 10 |
| 26 | URL 26 | Downloaded page 26 | utf-8 | 26-utf8.txt | 26.txt | 26-utf8.html | 101 |
| 27 | URL 27 | Downloaded page 27 | utf-8 | 27-utf8.txt | 27.txt | 27-utf8.html | 22 |
| 28 | URL 28 | Downloaded page 28 | utf-8 | 28-utf8.txt | 28.txt | 28-utf8.html | 22 |
| 29 | URL 29 | Downloaded page 29 | us-ascii | 29-utf8.txt | 29.txt | 29-utf8.html | 59 |
| 30 | URL 30 | Downloaded page 30 | utf-8 | 30-utf8.txt | 30.txt | 30-utf8.html | 34 |
| 31 | URL 31 | Downloaded page 31 | utf-8 | 31-utf8.txt | 31.txt | 31-utf8.html | 37 |
| 32 | URL 32 | Downloaded page 32 | us-ascii | 32-utf8.txt | 32.txt | 32-utf8.html | 21 |
| 33 | URL 33 | Downloaded page 33 | us-ascii | 33-utf8.txt | 33.txt | 33-utf8.html | 21 |
| 34 | URL 34 | Downloaded page 34 | us-ascii | 34-utf8.txt | 34.txt | 34-utf8.html | 19 |
| 35 | URL 35 | Downloaded page 35 | | | | | |
| 36 | URL 36 | Downloaded page 36 | us-ascii | 36-utf8.txt | 36.txt | 36-utf8.html | 86 |
| 37 | URL 37 | Downloaded page 37 | utf-8 | 37-utf8.txt | 37.txt | 37-utf8.html | 147 |
| 38 | URL 38 | Downloaded page 38 | us-ascii | 38-utf8.txt | 38.txt | 38-utf8.html | 27 |
| 39 | URL 39 | Downloaded page 39 | us-ascii | 39-utf8.txt | 39.txt | 39-utf8.html | 14 |
| 40 | URL 40 | Downloaded page 40 | utf-8 | 40-utf8.txt | 40.txt | 40-utf8.html | 161 |
| 41 | URL 41 | Downloaded page 41 | utf-8 | 41-utf8.txt | 41.txt | 41-utf8.html | 23 |
| 42 | URL 42 | Downloaded page 42 | | | | | |
| 43 | URL 43 | Downloaded page 43 | utf-8 | 43-utf8.txt | 43.txt | 43-utf8.html | 47 |
| 44 | URL 44 | Downloaded page 44 | utf-8 | 44-utf8.txt | 44.txt | 44-utf8.html | 26 |
| 45 | URL 45 | Downloaded page 45 | utf-8 | 45-utf8.txt | 45.txt | 45-utf8.html | 35 |
| 46 | URL 46 | Downloaded page 46 | us-ascii | 46-utf8.txt | 46.txt | 46-utf8.html | 6 |
| 47 | URL 47 | Downloaded page 47 | utf-8 | 47-utf8.txt | 47.txt | 47-utf8.html | 65 |
| 48 | URL 48 | Downloaded page 48 | us-ascii | 48-utf8.txt | 48.txt | 48-utf8.html | 7 |
| 49 | URL 49 | Downloaded page 49 | utf-8 | 49-utf8.txt | 49.txt | 49-utf8.html | 36 |
| 50 | URL 50 | Downloaded page 50 | us-ascii | 50-utf8.txt | 50.txt | 50-utf8.html | 5 |

Global DUMP file: 47 file(s)
Global CONTEXTS file: 47 file(s)

Table 2
Source file: ./URLS/arabe.txt

| No. | URL | Downloaded page | Initial DUMP (non-UTF-8) | UTF-8 DUMP | CONTEXTS | HTML CONTEXTS | Occurrences |
|---|---|---|---|---|---|---|---|
| 1 | URL 1 | Downloaded page 1 | utf-8 | 1-utf8.txt | 1.txt | 1-utf8.html | 6 |
| 2 | URL 2 | Downloaded page 2 | utf-8 | 2-utf8.txt | 2.txt | 2-utf8.html | 8 |
| 3 | URL 3 | Downloaded page 3 | windows-1256 | 3-utf8.txt | 3.txt | 3-utf8.html | 0 |
| 4 | URL 4 | Downloaded page 4 | utf-8 | 4-utf8.txt | 4.txt | 4-utf8.html | 12 |
| 5 | URL 5 | Downloaded page 5 | utf-8 | 5-utf8.txt | 5.txt | 5-utf8.html | 0 |
| 6 | URL 6 | Downloaded page 6 | utf-8 | 6-utf8.txt | 6.txt | 6-utf8.html | 8 |
| 7 | URL 7 | Downloaded page 7 | utf-8 | 7-utf8.txt | 7.txt | 7-utf8.html | 2 |
| 8 | URL 8 | Downloaded page 8 | windows-1256 | 8-utf8.txt | 8.txt | 8-utf8.html | 8 |
| 9 | URL 9 | Downloaded page 9 | utf-8 | 9-utf8.txt | 9.txt | 9-utf8.html | 0 |
| 10 | URL 10 | Downloaded page 10 | windows-1256 | 10-utf8.txt | 10.txt | 10-utf8.html | 0 |
| 11 | URL 11 | Downloaded page 11 | utf-8 | 11-utf8.txt | 11.txt | 11-utf8.html | 0 |
| 12 | URL 12 | Downloaded page 12 | utf-8 | 12-utf8.txt | 12.txt | 12-utf8.html | 1 |
| 13 | URL 13 | Downloaded page 13 | utf-8 | 13-utf8.txt | 13.txt | 13-utf8.html | 2 |
| 14 | URL 14 | Downloaded page 14 | windows-1256 | 14-utf8.txt | 14.txt | 14-utf8.html | 0 |
| 15 | URL 15 | Downloaded page 15 | utf-8 | 15-utf8.txt | 15.txt | 15-utf8.html | 0 |
| 16 | URL 16 | Downloaded page 16 | utf-8 | 16-utf8.txt | 16.txt | 16-utf8.html | 0 |
| 17 | URL 17 | Downloaded page 17 | windows-1256 | 17-utf8.txt | 17.txt | 17-utf8.html | 0 |
| 18 | URL 18 | Downloaded page 18 | utf-8 | 18-utf8.txt | 18.txt | 18-utf8.html | 0 |
| 19 | URL 19 | Downloaded page 19 | utf-8 | 19-utf8.txt | 19.txt | 19-utf8.html | 4 |
| 20 | URL 20 | Downloaded page 20 | windows-1256 | 20-utf8.txt | 20.txt | 20-utf8.html | 0 |
| 21 | URL 21 | Downloaded page 21 | utf-8 | 21-utf8.txt | 21.txt | 21-utf8.html | 5 |
| 22 | URL 22 | Downloaded page 22 | windows-1256 | 22-utf8.txt | 22.txt | 22-utf8.html | 0 |
| 23 | URL 23 | Downloaded page 23 | utf-8 | 23-utf8.txt | 23.txt | 23-utf8.html | 2 |
| 24 | URL 24 | Downloaded page 24 | utf-8 | 24-utf8.txt | 24.txt | 24-utf8.html | 0 |
| 25 | URL 25 | Downloaded page 25 | windows-1256 | 25-utf8.txt | 25.txt | 25-utf8.html | 5 |
| 26 | URL 26 | Downloaded page 26 | utf-8 | 26-utf8.txt | 26.txt | 26-utf8.html | 2 |
| 27 | URL 27 | Downloaded page 27 | utf-8 | 27-utf8.txt | 27.txt | 27-utf8.html | 0 |
| 28 | URL 28 | Downloaded page 28 | windows-1256 | 28-utf8.txt | 28.txt | 28-utf8.html | 0 |
| 29 | URL 29 | Downloaded page 29 | windows-1252 | 29-utf8.txt | 29.txt | 29-utf8.html | 1 |
| 30 | URL 30 | Downloaded page 30 | utf-8 | 30-utf8.txt | 30.txt | 30-utf8.html | 0 |
| 31 | URL 31 | Downloaded page 31 | windows-1256 | 31-utf8.txt | 31.txt | 31-utf8.html | 0 |
| 32 | URL 32 | Downloaded page 32 | windows-1256 | 32-utf8.txt | 32.txt | 32-utf8.html | 0 |
| 33 | URL 33 | Downloaded page 33 | utf-8 | 33-utf8.txt | 33.txt | 33-utf8.html | 1 |
| 34 | URL 34 | Downloaded page 34 | utf-8 | 34-utf8.txt | 34.txt | 34-utf8.html | 1 |
| 35 | URL 35 | Downloaded page 35 | windows-1256 | 35-utf8.txt | 35.txt | 35-utf8.html | 0 |
| 36 | URL 36 | Downloaded page 36 | | | | | |
| 37 | URL 37 | Downloaded page 37 | utf-8 | 37-utf8.txt | 37.txt | 37-utf8.html | 9 |
| 38 | URL 38 | Downloaded page 38 | us-ascii | 38-utf8.txt | 38.txt | 38-utf8.html | 0 |
| 39 | URL 39 | Downloaded page 39 | windows-1256 | 39-utf8.txt | 39.txt | 39-utf8.html | 10 |
| 40 | URL 40 | Downloaded page 40 | windows-1256 | 40-utf8.txt | 40.txt | 40-utf8.html | 0 |
| 41 | URL 41 | Downloaded page 41 | utf-8 | 41-utf8.txt | 41.txt | 41-utf8.html | 0 |
| 42 | URL 42 | Downloaded page 42 | utf-8 | 42-utf8.txt | 42.txt | 42-utf8.html | 0 |
| 43 | URL 43 | Downloaded page 43 | windows-1256 | 43-utf8.txt | 43.txt | 43-utf8.html | 0 |
| 44 | URL 44 | Downloaded page 44 | windows-1256 | 44-utf8.txt | 44.txt | 44-utf8.html | 0 |
| 45 | URL 45 | Downloaded page 45 | utf-8 | 45-utf8.txt | 45.txt | 45-utf8.html | 0 |
| 46 | URL 46 | Downloaded page 46 | | | | | |
| 47 | URL 47 | Downloaded page 47 | utf-8 | 47-utf8.txt | 47.txt | 47-utf8.html | 0 |
| 48 | URL 48 | Downloaded page 48 | utf-8 | 48-utf8.txt | 48.txt | 48-utf8.html | 0 |
| 49 | URL 49 | Downloaded page 49 | utf-8 | 49-utf8.txt | 49.txt | 49-utf8.html | 0 |
| 50 | URL 50 | Downloaded page 50 | utf-8 | 50-utf8.txt | 50.txt | 50-utf8.html | 0 |

Global DUMP file: 48 file(s)
Global CONTEXTS file: 48 file(s)

Table 3
Source file: ./URLS/francais.txt

| No. | URL | Downloaded page | Initial DUMP (non-UTF-8) | UTF-8 DUMP | CONTEXTS | HTML CONTEXTS | Occurrences |
|---|---|---|---|---|---|---|---|
| 1 | URL 1 | Downloaded page 1 | utf-8 | 1-utf8.txt | 1.txt | 1-utf8.html | 98 |
| 2 | URL 2 | Downloaded page 2 | | | | | |
| 3 | URL 3 | Downloaded page 3 | | | | | |
| 4 | URL 4 | Downloaded page 4 | utf-8 | 4-utf8.txt | 4.txt | 4-utf8.html | 8 |
| 5 | URL 5 | Downloaded page 5 | | | | | |
| 6 | URL 6 | Downloaded page 6 | | | | | |
| 7 | URL 7 | Downloaded page 7 | utf-8 | 7-utf8.txt | 7.txt | 7-utf8.html | 13 |
| 8 | URL 8 | Downloaded page 8 | us-ascii | 8-utf8.txt | 8.txt | 8-utf8.html | 36 |
| 9 | URL 9 | Downloaded page 9 | utf-8 | 9-utf8.txt | 9.txt | 9-utf8.html | 59 |
| 10 | URL 10 | Downloaded page 10 | utf-8 | 10-utf8.txt | 10.txt | 10-utf8.html | 15 |
| 11 | URL 11 | Downloaded page 11 | utf-8 | 11-utf8.txt | 11.txt | 11-utf8.html | 26 |
| 12 | URL 12 | Downloaded page 12 | utf-8 | 12-utf8.txt | 12.txt | 12-utf8.html | 113 |
| 13 | URL 13 | Downloaded page 13 | utf-8 | 13-utf8.txt | 13.txt | 13-utf8.html | 46 |
| 14 | URL 14 | Downloaded page 14 | utf-8 | 14-utf8.txt | 14.txt | 14-utf8.html | 9 |
| 15 | URL 15 | Downloaded page 15 | utf-8 | 15-utf8.txt | 15.txt | 15-utf8.html | 24 |
| 16 | URL 16 | Downloaded page 16 | | | | | |
| 17 | URL 17 | Downloaded page 17 | utf-8 | 17-utf8.txt | 17.txt | 17-utf8.html | 28 |
| 18 | URL 18 | Downloaded page 18 | utf-8 | 18-utf8.txt | 18.txt | 18-utf8.html | 14 |
| 19 | URL 19 | Downloaded page 19 | utf-8 | 19-utf8.txt | 19.txt | 19-utf8.html | 30 |
| 20 | URL 20 | Downloaded page 20 | utf-8 | 20-utf8.txt | 20.txt | 20-utf8.html | 19 |
| 21 | URL 21 | Downloaded page 21 | | | | | |
| 22 | URL 22 | Downloaded page 22 | | | | | |
| 23 | URL 23 | Downloaded page 23 | utf-8 | 23-utf8.txt | 23.txt | 23-utf8.html | 53 |
| 24 | URL 24 | Downloaded page 24 | utf-8 | 24-utf8.txt | 24.txt | 24-utf8.html | 21 |
| 25 | URL 25 | Downloaded page 25 | | | | | |
| 26 | URL 26 | Downloaded page 26 | utf-8 | 26-utf8.txt | 26.txt | 26-utf8.html | 66 |
| 27 | URL 27 | Downloaded page 27 | | | | | |
| 28 | URL 28 | Downloaded page 28 | | | | | |
| 29 | URL 29 | Downloaded page 29 | utf-8 | 29-utf8.txt | 29.txt | 29-utf8.html | 20 |
| 30 | URL 30 | Downloaded page 30 | utf-8 | 30-utf8.txt | 30.txt | 30-utf8.html | 7 |
| 31 | URL 31 | Downloaded page 31 | utf-8 | 31-utf8.txt | 31.txt | 31-utf8.html | 9 |
| 32 | URL 32 | Downloaded page 32 | utf-8 | 32-utf8.txt | 32.txt | 32-utf8.html | 8 |
| 33 | URL 33 | Downloaded page 33 | utf-8 | 33-utf8.txt | 33.txt | 33-utf8.html | 21 |
| 34 | URL 34 | Downloaded page 34 | utf-8 | 34-utf8.txt | 34.txt | 34-utf8.html | 27 |
| 35 | URL 35 | Downloaded page 35 | | | | | |
| 36 | URL 36 | Downloaded page 36 | utf-8 | 36-utf8.txt | 36.txt | 36-utf8.html | 26 |
| 37 | URL 37 | Downloaded page 37 | | | | | |
| 38 | URL 38 | Downloaded page 38 | | | | | |
| 39 | URL 39 | Downloaded page 39 | utf-8 | 39-utf8.txt | 39.txt | 39-utf8.html | 18 |
| 40 | URL 40 | Downloaded page 40 | us-ascii | 40-utf8.txt | 40.txt | 40-utf8.html | 34 |
| 41 | URL 41 | Downloaded page 41 | utf-8 | 41-utf8.txt | 41.txt | 41-utf8.html | 11 |
| 42 | URL 42 | Downloaded page 42 | | | | | |
| 43 | URL 43 | Downloaded page 43 | utf-8 | 43-utf8.txt | 43.txt | 43-utf8.html | 27 |
| 44 | URL 44 | Downloaded page 44 | utf-8 | 44-utf8.txt | 44.txt | 44-utf8.html | 12 |
| 45 | URL 45 | Downloaded page 45 | utf-8 | 45-utf8.txt | 45.txt | 45-utf8.html | 28 |
| 46 | URL 46 | Downloaded page 46 | | | | | |
| 47 | URL 47 | Downloaded page 47 | utf-8 | 47-utf8.txt | 47.txt | 47-utf8.html | 44 |
| 48 | URL 48 | Downloaded page 48 | | | | | |
| 49 | URL 49 | Downloaded page 49 | utf-8 | 49-utf8.txt | 49.txt | 49-utf8.html | 23 |
| 50 | URL 50 | Downloaded page 50 | | | | | |
| 51 | URL 51 | Downloaded page 51 | utf-8 | 51-utf8.txt | 51.txt | 51-utf8.html | 109 |
| 52 | URL 52 | Downloaded page 52 | utf-8 | 52-utf8.txt | 52.txt | 52-utf8.html | 51 |

Global DUMP file: 35 file(s)
Global CONTEXTS file: 35 file(s)

Table 4
Source file: ./URLS/russe.txt

| No. | URL | Downloaded page | Initial DUMP (non-UTF-8) | UTF-8 DUMP | CONTEXTS | HTML CONTEXTS | Occurrences |
|---|---|---|---|---|---|---|---|
| 1 | URL 1 | Downloaded page 1 | utf-8 | 1-utf8.txt | 1.txt | 1-utf8.html | 50 |
| 2 | URL 2 | Downloaded page 2 | | | | | |
| 3 | URL 3 | Downloaded page 3 | utf-8 | 3-utf8.txt | 3.txt | 3-utf8.html | 13 |
| 4 | URL 4 | Downloaded page 4 | windows-1251 | 4-utf8.txt | 4.txt | 4-utf8.html | 52 |
| 5 | URL 5 | Downloaded page 5 | windows-1251 | 5-utf8.txt | 5.txt | 5-utf8.html | 37 |
| 6 | URL 6 | Downloaded page 6 | | | | | |
| 7 | URL 7 | Downloaded page 7 | utf-8 | 7-utf8.txt | 7.txt | 7-utf8.html | 14 |
| 8 | URL 8 | Downloaded page 8 | windows-1251 | 8-utf8.txt | 8.txt | 8-utf8.html | 38 |
| 9 | URL 9 | Downloaded page 9 | utf-8 | 9-utf8.txt | 9.txt | 9-utf8.html | 12 |
| 10 | URL 10 | Downloaded page 10 | utf-8 | 10-utf8.txt | 10.txt | 10-utf8.html | 23 |
| 11 | URL 11 | Downloaded page 11 | windows-1251 | 11-utf8.txt | 11.txt | 11-utf8.html | 28 |
| 12 | URL 12 | Downloaded page 12 | utf-8 | 12-utf8.txt | 12.txt | 12-utf8.html | 47 |
| 13 | URL 13 | Downloaded page 13 | utf-8 | 13-utf8.txt | 13.txt | 13-utf8.html | 27 |
| 14 | URL 14 | Downloaded page 14 | utf-8 | 14-utf8.txt | 14.txt | 14-utf8.html | 26 |
| 15 | URL 15 | Downloaded page 15 | utf-8 | 15-utf8.txt | 15.txt | 15-utf8.html | 82 |
| 16 | URL 16 | Downloaded page 16 | utf-8 | 16-utf8.txt | 16.txt | 16-utf8.html | 22 |
| 17 | URL 17 | Downloaded page 17 | windows-1251 | 17-utf8.txt | 17.txt | 17-utf8.html | 70 |
| 18 | URL 18 | Downloaded page 18 | | | | | |
| 19 | URL 19 | Downloaded page 19 | windows-1251 | 19-utf8.txt | 19.txt | 19-utf8.html | 8 |
| 20 | URL 20 | Downloaded page 20 | | | | | |
| 21 | URL 21 | Downloaded page 21 | us-ascii | 21-utf8.txt | 21.txt | 21-utf8.html | 0 |
| 22 | URL 22 | Downloaded page 22 | utf-8 | 22-utf8.txt | 22.txt | 22-utf8.html | 23 |
| 23 | URL 23 | Downloaded page 23 | utf-8 | 23-utf8.txt | 23.txt | 23-utf8.html | 36 |
| 24 | URL 24 | Downloaded page 24 | utf-8 | 24-utf8.txt | 24.txt | 24-utf8.html | 7 |
| 25 | URL 25 | Downloaded page 25 | utf-8 | 25-utf8.txt | 25.txt | 25-utf8.html | 5 |
| 26 | URL 26 | Downloaded page 26 | windows-1251 | 26-utf8.txt | 26.txt | 26-utf8.html | 35 |
| 27 | URL 27 | Downloaded page 27 | utf-8 | 27-utf8.txt | 27.txt | 27-utf8.html | 17 |
| 28 | URL 28 | Downloaded page 28 | utf-8 | 28-utf8.txt | 28.txt | 28-utf8.html | 4 |
| 29 | URL 29 | Downloaded page 29 | utf-8 | 29-utf8.txt | 29.txt | 29-utf8.html | 8 |
| 30 | URL 30 | Downloaded page 30 | windows-1251 | 30-utf8.txt | 30.txt | 30-utf8.html | 46 |
| 31 | URL 31 | Downloaded page 31 | utf-8 | 31-utf8.txt | 31.txt | 31-utf8.html | 18 |
| 32 | URL 32 | Downloaded page 32 | utf-8 | 32-utf8.txt | 32.txt | 32-utf8.html | 19 |
| 33 | URL 33 | Downloaded page 33 | utf-8 | 33-utf8.txt | 33.txt | 33-utf8.html | 5 |
| 34 | URL 34 | Downloaded page 34 | utf-8 | 34-utf8.txt | 34.txt | 34-utf8.html | 55 |
| 35 | URL 35 | Downloaded page 35 | windows-1251 | 35-utf8.txt | 35.txt | 35-utf8.html | 102 |
| 36 | URL 36 | Downloaded page 36 | utf-8 | 36-utf8.txt | 36.txt | 36-utf8.html | 214 |
| 37 | URL 37 | Downloaded page 37 | | | | | |
| 38 | URL 38 | Downloaded page 38 | windows-1251 | 38-utf8.txt | 38.txt | 38-utf8.html | 4 |
| 39 | URL 39 | Downloaded page 39 | windows-1251 | 39-utf8.txt | 39.txt | 39-utf8.html | 16 |
| 40 | URL 40 | Downloaded page 40 | windows-1251 | 40-utf8.txt | 40.txt | 40-utf8.html | 10 |
| 41 | URL 41 | Downloaded page 41 | utf-8 | 41-utf8.txt | 41.txt | 41-utf8.html | 7 |
| 42 | URL 42 | Downloaded page 42 | utf-8 | 42-utf8.txt | 42.txt | 42-utf8.html | 14 |
| 43 | URL 43 | Downloaded page 43 | | | | | |
| 44 | URL 44 | Downloaded page 44 | | | | | |
| 45 | URL 45 | Downloaded page 45 | utf-8 | 45-utf8.txt | 45.txt | 45-utf8.html | 34 |
| 46 | URL 46 | Downloaded page 46 | utf-8 | 46-utf8.txt | 46.txt | 46-utf8.html | 3 |
| 47 | URL 47 | Downloaded page 47 | utf-8 | 47-utf8.txt | 47.txt | 47-utf8.html | 40 |
| 48 | URL 48 | Downloaded page 48 | windows-1251 | 48-utf8.txt | 48.txt | 48-utf8.html | 23 |
| 49 | URL 49 | Downloaded page 49 | windows-1251 | 49-utf8.txt | 49.txt | 49-utf8.html | 17 |
| 50 | URL 50 | Downloaded page 50 | utf-8 | 50-utf8.txt | 50.txt | 50-utf8.html | 79 |

Global DUMP file: 43 file(s)
Global CONTEXTS file: 43 file(s)
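
The file counts printed under each table appear to match the number of rows that were fully processed: the first table, for instance, lists 50 URLs, three of which have empty rows, and reports 47 file(s). The sketch below is one possible way to cross-check this from the row text. It is not part of our script: the function name `summarize` and the rule "a row counts as processed when it ends with an `.html` file name followed by a count" are assumptions made for illustration.

```python
def summarize(rows):
    """Return (processed, skipped, total_occurrences) for table rows.

    Assumption: a data row starts with a number followed by "URL"; it is
    'processed' when its last two fields are an .html file name and a count.
    Column separators ("|") and the wording of the page label are ignored.
    """
    processed = skipped = total = 0
    for line in rows:
        tokens = line.replace("|", " ").split()
        # Ignore captions, header rows, and footer lines such as file counts.
        if len(tokens) < 2 or not tokens[0].isdigit() or tokens[1] != "URL":
            continue
        if tokens[-2].endswith(".html") and tokens[-1].isdigit():
            processed += 1
            total += int(tokens[-1])
        else:
            skipped += 1  # row with no dump, contexts, or count
    return processed, skipped, total


# Rows 3 to 5 of the first table.
sample = [
    "3 URL 3 Downloaded page 3 utf-8 3-utf8.txt 3.txt 3-utf8.html 28",
    "4 URL 4 Downloaded page 4",
    "5 URL 5 Downloaded page 5 us-ascii 5-utf8.txt 5.txt 5-utf8.html 3",
]
print(summarize(sample))  # (2, 1, 31)
```

Feeding the function all rows of one table should give a processed count equal to the file count reported for that table, if the assumption above holds.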