Scrape Attribute

From UBot Studio
Latest revision as of 17:25, 24 December 2016

$Scrape Attribute is a Browser Function.

The function returns the value of a specified attribute of a selected element. As with all other scraping, the scraped data is stored in a list or a variable.

The Element Selector is used to select the element that $scrape attribute reads from.

Element To Scrape: The element on the page selected for scraping

Attribute To Scrape: The attribute of the selected page element that is going to be scraped (for example, the innerhtml)
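
The two parameters above can be pictured outside UBot Studio as well. The following is a minimal Python sketch, not UBot code: it picks elements by one attribute value (the "element to scrape") and returns another attribute of those elements (the "attribute to scrape"). The HTML snippet, the class name AttributeScraper, and the attribute names are assumptions made up for illustration.

```python
from html.parser import HTMLParser


class AttributeScraper(HTMLParser):
    """Collect one attribute from every tag whose selector attribute matches."""

    def __init__(self, match_attr, match_value, scrape_attr):
        super().__init__()
        self.match_attr = match_attr      # attribute used to select the element
        self.match_value = match_value    # value the selector attribute must have
        self.scrape_attr = scrape_attr    # attribute whose value is returned
        self.results = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        selected = attrs.get(self.match_attr) == self.match_value
        if selected and self.scrape_attr in attrs:
            self.results.append(attrs[self.scrape_attr])


# Hypothetical page fragment, not taken from any real site.
html = (
    '<a href="/resources" title="Resources">Training</a>'
    '<a href="/blog" title="Blog">Blog</a>'
)

scraper = AttributeScraper("href", "/resources", "title")
scraper.feed(html)
print(scraper.results)  # ['Resources']
```

Only the first link is selected, so only its title is returned. UBot's selectors also accept browser properties such as innertext, which this sketch does not model.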

Example 1

navigate("http://ubotstudio.com/resources","Wait")
set(#my item,$scrape attribute(<innertext="Video Training in Ten Minutes or Less!">,"innertext"),"Global")


Running the script sets a variable named "my item". The selector finds the element whose innertext is "Video Training in Ten Minutes or Less!", and $scrape attribute returns that element's innertext. If the element is present on the page, its innertext is stored in the variable, as seen in the debugger.


[Image: Scrattri.jpg]


navigate("http://wiki.ubotstudio.com/wiki/Main_Page","Wait")
add list to list(%my list,$scrape attribute(<href=w"/wiki/*">,"innertext"),"Delete","Global")


This script scrapes the innertext of every element whose href matches the wildcard pattern /wiki/*. Each scraped value is added to the list "my list" as an individual list item.
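
As a rough illustration of wildcard selection, here is a minimal Python sketch (not UBot code) that keeps the innertext of every link whose href matches /wiki/*. The link data and the function name scrape_innertext_by_href are hypothetical, and Python's fnmatch rules are only assumed to approximate UBot's wildcard matching.

```python
from fnmatch import fnmatchcase

# Hypothetical (href, innertext) pairs standing in for links on a page.
links = [
    ("/wiki/Main_Page", "Main Page"),
    ("/wiki/Scrape_Attribute", "Scrape Attribute"),
    ("http://ubotstudio.com/resources", "Resources"),
]


def scrape_innertext_by_href(links, pattern):
    """Return the innertext of every link whose href matches the wildcard."""
    return [text for href, text in links if fnmatchcase(href, pattern)]


my_list = scrape_innertext_by_href(links, "/wiki/*")
print(my_list)  # ['Main Page', 'Scrape Attribute']
```

Each match becomes its own list item, which mirrors how the scraped values end up as separate entries in the UBot list.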


[Image: Scrattri0.jpg]

Example 2: Scraping with Regular Expressions

To use a regular expression (Regex) to find an element on a webpage for scraping, click the Advanced Editor option, select the attribute the Regex will match, and then choose Regular Expressions from the Exact Match drop-down.


[Image: Scregex.gif]


navigate("http://listofrandomnames.com/index.cfm?generated", "Wait")
wait(3)
click(<type="submit">, "Left Click", "No")
wait(3)
set(#blue, $scrape attribute(<innertext=r"^(?<FirstName>\\w+)\\s(?<LastName>\\w+)$">, "innertext"), "Global")

Running the script returns the following two-word values:

More Tools
Alden Gero
Deanna Badillo
Earlean Schulman
Kimberley Maney
Treva Belnap
Sharon Kempker
Shemika Anderton
Cleopatra Eberhardt
Zenobia Molloy
Sherell Vanepps
Jenette Belfiore
Rosamond Boyden
Jama Pless
Delma Brightwell
Scot Elswick
Ivy Peed
Melodie Cendejas
Fernande Wimmer
Stephine Twiggs
Ashlea Strasburg
Lorem Ipsum
Joe Apple

Notice that the value "More Tools" was returned even though it is not a name.

That is because the regex only requires two words separated by a single whitespace character, and "More Tools" meets that criterion just as the names do.
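
The behavior can be checked outside UBot Studio with a minimal Python sketch. It assumes the same pattern with single backslashes and with Python's (?P<name>...) spelling for named groups, since Python's re module does not accept the (?<name>...) form used in the script. The candidate strings are illustrative only.

```python
import re

# Same pattern as in the script, written with Python's named-group syntax.
pattern = re.compile(r"^(?P<FirstName>\w+)\s(?P<LastName>\w+)$")

# Illustrative innertext values; only the first two consist of exactly two words.
candidates = ["More Tools", "Alden Gero", "Generate", "List of Random Names"]

matches = [text for text in candidates if pattern.match(text)]
print(matches)  # ['More Tools', 'Alden Gero']

# The regex cannot tell a menu label from a person's name.
groups = pattern.match("More Tools").groupdict()
print(groups)  # {'FirstName': 'More', 'LastName': 'Tools'}
```

To exclude values such as "More Tools", the selector would need an additional constraint, for example matching on a second attribute of the name elements.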
