Scrape Attribute

From UBot Studio

Latest revision as of 17:25, 24 December 2016

$Scrape Attribute is a Browser Function.

The function returns the value of a specified attribute of a selected element. As with all other scraping, you scrape the data to a list or a variable.

The Element Selector is used to choose the element that $scrape attribute reads from.

Element To Scrape: The element on the page selected for scraping

Attribute To Scrape: The attribute of the selected page element that is going to be scraped (for example, the innerhtml)

Example 1

navigate("http://ubotstudio.com/resources","Wait")
set(#my item,$scrape attribute(<innertext="Video Training in Ten Minutes or Less!">,"innertext"),"Global")


Running the script sets a variable named "my item". $scrape attribute selects the element whose innertext is "Video Training in Ten Minutes or Less!". If that element is present on the page, its innertext is scraped to the variable, as seen in the debugger.


Scrattri.jpg


navigate("http://wiki.ubotstudio.com/wiki/Main_Page","Wait")
add list to list(%my list,$scrape attribute(<href=w"/wiki/*">,"innertext"),"Delete","Global")


This script scrapes the innertext of every element whose href meets the wildcard criteria /wiki/*. All scraped values are placed in the list "my list", and each one is treated as an individual list item.


Scrattri0.jpg
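
The wildcard selection above can be illustrated outside UBot Studio. The following is a minimal sketch in Python, not UBot script, and it assumes that * in a w"..." selector matches any run of characters, which is how Python's fnmatch treats it. The links list and the scrape_innertext helper are hypothetical and exist only for this illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical (href, innertext) pairs standing in for the links on a page.
links = [
    ("/wiki/Main_Page", "Main Page"),
    ("/wiki/Scrape_Attribute", "Scrape Attribute"),
    ("/files/guide.pdf", "Guide"),
]


def scrape_innertext(links, href_pattern):
    """Return the innertext of every link whose href matches the wildcard."""
    return [text for href, text in links if fnmatchcase(href, href_pattern)]


# Only the two /wiki/ links meet the wildcard criteria.
my_list = scrape_innertext(links, "/wiki/*")
print(my_list)
```

Each matching link contributes one item to the result, in the same way that each scraped value becomes its own list item in the script.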

Example 2: Scraping with Regular Expressions

To use Regex to find an element on a webpage for scraping, click the Advanced Editor option, select the attribute the Regex will match, and use the Exact Match drop-down to select the Regular Expressions option.


Scregex.gif


navigate("http://listofrandomnames.com/index.cfm?generated", "Wait")
wait(3)
click(<type="submit">, "Left Click", "No")
wait(3)
set(#blue, $scrape attribute(<innertext=r"^(?<FirstName>\\w+)\\s(?<LastName>\\w+)$">, "innertext"), "Global")

Running the script returns the following two-word values:

More Tools
Alden Gero
Deanna Badillo
Earlean Schulman
Kimberley Maney
Treva Belnap
Sharon Kempker
Shemika Anderton
Cleopatra Eberhardt
Zenobia Molloy
Sherell Vanepps
Jenette Belfiore
Rosamond Boyden
Jama Pless
Delma Brightwell
Scot Elswick
Ivy Peed
Melodie Cendejas
Fernande Wimmer
Stephine Twiggs
Ashlea Strasburg
Lorem Ipsum
Joe Apple

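Notice that the value "More Tools" was returned along with the names.

That is because it also meets the criteria specified by the regex: two words separated by a single whitespace character. The regex does not check capitalization or whether the words are actually names.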
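
The match on "More Tools" can be reproduced outside UBot Studio. The sketch below is in Python, not UBot script, so two things differ from the script: named groups are written (?P<name>...), and the pattern uses single backslashes, on the assumption that the doubled backslashes in the script are escaping. "Generate" and "List of Random Names" are hypothetical page texts added for contrast.

```python
import re

# Same pattern as the script, adapted to Python's named-group syntax.
pattern = re.compile(r"^(?P<FirstName>\w+)\s(?P<LastName>\w+)$")

# Three values from the output above, plus two hypothetical non-matching texts.
candidates = [
    "More Tools",
    "Alden Gero",
    "Lorem Ipsum",
    "Generate",
    "List of Random Names",
]

# Keep only the texts that consist of exactly two words.
matches = [text for text in candidates if pattern.match(text)]
print(matches)

# The named groups show how "More Tools" is split by the pattern.
groups = pattern.match("More Tools").groupdict()
print(groups)
```

Any text made of exactly two words passes, which is why a page label such as "More Tools" is scraped together with the generated names.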
