1
00:00:00,120 --> 00:00:06,690
If you're learning web scraping on your own, you probably know that the three main web scraping libraries

2
00:00:06,690 --> 00:00:12,990
in Python are Beautiful Soup, Selenium and Scrapy, but which one is the best?

3
00:00:13,200 --> 00:00:15,510
Let's find out in this video.

4
00:00:15,690 --> 00:00:16,800
The first library

5
00:00:16,830 --> 00:00:17,450
we'll see

6
00:00:17,490 --> 00:00:25,440
is Beautiful Soup. Beautiful Soup can pull data out of HTML and XML files, and this library is

7
00:00:25,440 --> 00:00:27,370
the best choice for beginners.

8
00:00:27,420 --> 00:00:32,360
This is because Beautiful Soup is the easiest web scraping library in Python.

9
00:00:32,550 --> 00:00:37,500
You only need a few lines of code to scrape a website with Beautiful Soup.

10
00:00:37,510 --> 00:00:44,700
And it's also easy to set up because you only need to install the Requests library and the Beautiful

11
00:00:44,700 --> 00:00:45,560
Soup library.

12
00:00:45,780 --> 00:00:48,740
And with these you can start scraping a website.

13
00:00:48,750 --> 00:00:54,450
Unfortunately, Beautiful Soup doesn't have support for JavaScript-driven websites.

14
00:00:54,660 --> 00:01:00,300
This is a big disadvantage because nowadays many websites run JavaScript.

15
00:01:00,330 --> 00:01:07,420
Also, Beautiful Soup is inefficient, and it has some dependencies that make it complicated to transfer

16
00:01:07,420 --> 00:01:09,030
code between projects.

17
00:01:09,210 --> 00:01:11,040
Next, we have Selenium.

18
00:01:11,040 --> 00:01:14,290
Selenium wasn't actually designed for web scraping.

19
00:01:14,310 --> 00:01:22,020
In fact, Selenium is a web driver designed to render web pages for test automation of web applications.

20
00:01:22,260 --> 00:01:29,340
This makes Selenium great for web scraping because many websites rely on JavaScript to create dynamic

21
00:01:29,340 --> 00:01:30,780
content on the page.

22
00:01:31,020 --> 00:01:37,020
So we can say that Selenium is one of the best libraries for scraping JavaScript-driven websites.

23
00:01:37,290 --> 00:01:42,090
Another advantage of Selenium is that it's easier to learn than Scrapy.

24
00:01:42,240 --> 00:01:45,260
Unfortunately, Selenium is slow.

25
00:01:45,270 --> 00:01:52,730
Web scraping with Selenium is slower than plain HTTP requests because all the scripts present

26
00:01:52,800 --> 00:01:55,210
on the web page will be executed.

27
00:01:55,230 --> 00:02:01,170
However, if speed isn't our top priority, Selenium will be a good option.

28
00:02:01,410 --> 00:02:03,330
Finally, we have Scrapy.

29
00:02:03,420 --> 00:02:11,160
Scrapy is a web crawling framework built especially for web scraping and written entirely in Python.

30
00:02:11,400 --> 00:02:16,350
This is without a doubt the most complete web scraping tool in Python.

31
00:02:16,560 --> 00:02:22,020
Unfortunately, Scrapy is harder to learn than Selenium or Beautiful Soup.

32
00:02:22,530 --> 00:02:30,180
That said, one of the biggest advantages of Scrapy is its speed. Since it's asynchronous, its

33
00:02:30,180 --> 00:02:37,080
spiders don't have to wait to make requests one at a time, but can make requests in parallel.

34
00:02:37,260 --> 00:02:44,760
This increases efficiency, which makes Scrapy memory- and CPU-efficient compared to the previous

35
00:02:44,760 --> 00:02:46,020
web scraping tools.

36
00:02:46,260 --> 00:02:50,150
Also, Scrapy is the most complete framework in Python.

37
00:02:50,310 --> 00:02:56,490
You can easily store data in databases, create crawlers and do more with Scrapy.

38
00:02:56,610 --> 00:02:58,200
So which one is the best?

39
00:02:58,350 --> 00:03:03,180
I would say every web scraping tool satisfies a specific need.

40
00:03:03,330 --> 00:03:07,020
For example, Beautiful Soup will be great for beginners.

41
00:03:07,020 --> 00:03:13,470
Selenium will be good for small projects that need to scrape JavaScript-driven websites, while Scrapy

42
00:03:13,470 --> 00:03:18,030
will be great for large projects where speed is a priority.

43
00:03:18,270 --> 00:03:25,290
And that's it. In this video, we learned the differences between Beautiful Soup, Selenium and Scrapy. Choose

44
00:03:25,290 --> 00:03:28,080
the web scraping tool that best fits

45
00:03:28,110 --> 00:03:29,970
the website you wish to scrape.

