Chapter Seven: UI element interactions. In this chapter, we will give an introduction to what UI element interactions are, the different input methods and output methods that are supported by UiPath, and also data scraping or, as we call it, web scraping.

Introduction. In any programming language, the interaction with the user interface is broadly divided into two categories. The first is input, that is, inserting data into an external application. Input refers to actions where a user, or a robot in this case, takes some action on an external application or a web page, such as clicking, typing, sending modifier keys, etc. This makes the application behave in a certain fashion and process the commands provided to it. The second is output, that is, receiving data from an external application. Output consists of actions that extract information out of the application and put it back into the UiPath program for display or for further processing. The output activities are non-recordable, since the main task they accomplish is making information visible to the user for reading. As discussed before, UiPath incorporates a number of activities to simulate the actions a human would perform manually on a system. However, there are three major methods for input functionality, that is, for the activities that include clicking and typing: hardware events, which is the default method for input, Send Window Messages, and Simulate.

Hardware events, or default. This is the input method chosen as default by UiPath Studio. It is a highly reliable method that works every time. In this method, the input action provided within the workflow is simulated as is by the robot, utilizing the hardware drivers of the keyboard and mouse installed in the operating system. This means actual typing and clicking takes place, and you will see the mouse pointer moving over the screen. As a result, this activity is a foreground process and doesn't work in the background. The performance of this activity in terms of speed of operation is rather slow.
Another quick thing to note is that in this input activity, the Empty Field parameter needs to be set in the Properties panel, as we discussed in the basic recording topic of our last chapter as well, because the previously written data in a text field is not automatically wiped out. So, to give a sample of how it works, we have UiPath Studio here. And let's say we have a Notepad, which is this window.

We're going to take a few input activities. Let's say we want to write Hello, and we want to empty the field as well. And here is the click operation. What the click operation is going to do is minimize the Notepad window. And we have another Type Into activity, which is going to type World. So we want to type Hello World, but in between we want to minimize the Notepad window as well. So what you can see here is that we have a Type Into activity that types Hello into the Notepad window. Then we'll be minimizing the window, and then we will be typing the word World.

As you can see, the last two properties in all three of these input activities are Send Window Messages and Simulate Type. If we uncheck both of them, then the default method, the hardware input method, is going to be taken into consideration. So let's run it and observe a little meticulously: we have to watch the mouse pointer and how the overall operation works.

So let's run it. You can see Hello is written, the window is minimized, and then the robot opened the window again and wrote World. So for any operation to actually occur, the target application needs to be active. You see, the mouse pointer also moved. Let's do it one more time to see that the mouse pointer moves: Hello is written, the mouse actually clicked, and then World was written. So the mouse pointer or the keyboard cursor has to be pointed to where the operation needs to take place, and then the actual operation takes place.

So that's how the default method works. It is very reliable because it is going to work pretty much every time. The only downside is that it cannot be a background process, and the speed of operation is relatively slow because we are interacting through the GUI elements of the target application to perform our operations. That's how the default method works. Now let's showcase the other ones.
I'm pretty sure it is becoming clear now how the other methods are going to work. The second method is called the Send Window Messages method. This method can be activated by checking the Send Window Messages checkbox, as I just showed in the Properties panel. The way it works is that the UiPath robot sends a specific message to the target application, like Notepad in our previous example, and the target application, in turn, responds to the message by performing the actions requested of it. The upside of this method is that it can work in the background, and it generally works faster than the default method. However, the difference in speed is not significant, and the process is not fully reliable, because compatibility with the target application comes into the picture here.

We'll take the same example. For the first activity, let's say we want to keep the window active anyway, so it doesn't matter much whether we use Send Window Messages or not. But for the click operation, we use Send Window Messages. So the bot is going to send a message to Notepad asking it to minimize the window, and the text World that we want to write in the Notepad document is also going to be sent with Send Window Messages. As I said, the reliability of this input method is a little lower: say, about 80 times out of a hundred it's going to work, but there's a pretty decent chance it might not.

So let's see how it works, and we're going to make it work as well. We have this open, and we have also checked Empty Field on the first activity. Emptying the field the first time should be enough, because we want both Hello and World to be printed here. So the first time we are emptying the field, and the next time we are not, because otherwise it would wipe out the Hello that was written by the first activity; we'll just write World. So let's try and see if it works or not.

We run this, and Hello is written, and then we get an exception. It says that this message action is not supported by this type of element and that you have to use another type of click. That's where the reliability factor comes into play. Notepad is the sort of application that UiPath cannot interact with by sending messages, at least for the click operation that needs to happen here. So we have to resort to some other method. It could be the last method, which we are going to discuss in a few minutes, or the default method. So let's keep the click on the default method. For the last typing activity, we'll still keep the Send Window Messages option and see if that works or not.

We run the program again: it wipes out the existing text, writes Hello, minimizes the window, and the bot is shown again. If you noticed, unlike the previous time, the Notepad window did not stay active, and the operation still worked: World was still written. The application was not active, which means this method supports background processing, and the typing activity with Notepad as the target application works perfectly using the Send Window Messages input method. When that is the case, it's always advisable to use the faster method, the one that gives better throughput, because when we are building large projects, these factors make a pretty significant difference.
So that's how we use Send Window Messages. The third and last input method supported within UiPath is called the Simulate method. This is the fastest among the three input methods used in UiPath. Whenever it is selected, the action to be performed is mimicked using the technology of the target application. The target application behaves as if somebody is clicking on it, but in reality the mouse and keyboard drivers installed in our operating system don't come into play. The target application is made to feel that the actions are happening to it, but it's actually just a simulation, and it works efficiently in the background.

The reliability, meaning whether the operation will work for sure, is the lowest with this method, due to its dependency on the behavior of the target application. Let's say the reliability is around 70%; there's a 30% chance that the operation might not work. And the major reason why this method is often not used is that it doesn't support keyboard shortcut activities, like Send Hotkey with modifiers or function keys. These are usually not supported with the Simulate Type method.

But as I said, this is the fastest method, so you have to try it, and if it works, it's definitely the better choice, because it can make a pretty significant difference in terms of the efficiency and throughput of your operations. There is one other thing that happens with the Simulate method, and I'm going to show it with this example. Let's say we have kept the first two activities the same way, with the first one typing Hello and emptying the field. And the third one we want to set to Simulate Type. Technically we should get the same result, Hello World, because we are not emptying the field here in the Simulate Type activity. Let's see if that works or not. We ran the program: it wrote Hello, minimized the window, and the bot is shown again. But if you go and check the output, it only shows World. Why?

Because the Simulate input method has a feature that it automatically wipes the old data and writes its own data. It's built in. So whether or not you check Empty Field, it's not going to matter: all the data that pre-exists in a text field, a text area, or a text line in this case, is going to be wiped out. It works fine here, but let's say that's not what we want. What is the alternative approach if you still want to use the Simulate Type method?

Usually in projects, what people do is this: after, let's say, this activity, and before the Type Into that uses the Simulate method, you use an activity called Get Text. In Get Text, you select the element from which the data needs to be captured, which is here, and you store its value in a variable. Ctrl+K is the shortcut to create a variable for the output, and you can name it msg. Now, in the Type Into, you can put something like msg plus World. You have to connect these activities as well. So you get whatever pre-written text is in the field, and then you add it to the text of the Type Into activity where the Simulate Type method is used.

That way, you'll get the whole Hello World written in the field. And you see it worked in the background. If we check it, it says Hello World now. So that's an alternative approach, and it's still going to be very fast: if you were writing a lot of text, and if there were multiple operations going on against the same target application, trust me, it would make a significant difference. So those are the three input methods that are supported within UiPath.
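To make the Get Text workaround concrete, here is a minimal sketch in Python. It is only a model of the behavior described above, not UiPath code: the TextField class and the two helper functions are hypothetical stand-ins for the Notepad editor, the Simulate Type input, and the Get Text activity, and the trailing space after Hello is an assumption.

```python
class TextField:
    """Hypothetical stand-in for a text area such as the Notepad editor."""

    def __init__(self, text=""):
        self.text = text


def type_into_simulate(field, text):
    # Models the behavior described above: Simulate Type replaces
    # whatever is already in the field with the new text.
    field.text = text


def get_text(field):
    # Models the Get Text activity: read the field's current contents.
    return field.text


# Naive attempt: the existing "Hello " is lost.
naive = TextField("Hello ")
type_into_simulate(naive, "World")
print(naive.text)  # World

# Workaround: read the old text first, then type old text plus new text.
field = TextField("Hello ")
msg = get_text(field)
type_into_simulate(field, msg + "World")
print(field.text)  # Hello World
```

The point of the sketch is just the order of operations: capture the existing text into msg before the simulated typing happens, and then type the concatenation.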
They're relatively straightforward, and that's how we use them. A quick summary of how these input methods work: the default method has a reliability of a hundred percent, the speed is relatively slow, and background execution is not supported. The field data is not erased automatically. And yes, you can send hotkeys, like Ctrl, Alt, or some combination of modifier keys along with your usual keyboard keys. For Send Window Messages, the reliability is a little low, and the speed is decent; it's not too bad. It supports background execution. The field data is not erased automatically, and it also supports hotkey input. The Simulate method is the fastest, but its reliability is on the lower side, around 70%. It supports background execution. It's the only input method that automatically erases the data in the field that is selected as the UI element where the input operation needs to take place, and it doesn't support hotkey input.
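The summary above can also be written down as a small lookup table. The sketch below is in Python and is only an illustration: the InputMethod class, the speed_rank numbers (a relative ordering, not a measurement), and the candidates_fastest_first helper are all made up here to show how one might shortlist methods to try, fastest first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputMethod:
    name: str
    speed_rank: int          # 1 = slowest, 3 = fastest (relative ordering only)
    background: bool         # works while the target window is not active
    auto_erases_field: bool  # wipes existing field data on its own
    hotkeys: bool            # can send hotkeys / modifier keys


METHODS = [
    InputMethod("Default (hardware events)", 1, False, False, True),
    InputMethod("Send Window Messages", 2, True, False, True),
    InputMethod("Simulate", 3, True, True, False),
]


def candidates_fastest_first(need_background, need_hotkeys):
    """Return the names of the methods that meet the needs, fastest first."""
    matching = [
        m for m in METHODS
        if (m.background or not need_background)
        and (m.hotkeys or not need_hotkeys)
    ]
    matching.sort(key=lambda m: m.speed_rank, reverse=True)
    return [m.name for m in matching]


print(candidates_fastest_first(need_background=True, need_hotkeys=True))
print(candidates_fastest_first(need_background=False, need_hotkeys=False))
```

A shortlist like this only says what is worth trying. As the Notepad click example showed, whether a faster method actually works still depends on the target application, so you try the first candidate and fall back to the next one if it fails.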
Output methods. As I briefly discussed before, there are a number of ways data is extracted from the external application into the workflow for display or for computation. This external application could be a local desktop app, a web browser, a text document, an image, or perhaps a virtual environment. The data involved in this extraction process could be huge, highly complex, and wrapped under a complex user interface. Also, there may be times when the user needs to extract metadata, such as the color of text, the position of a word, or other associated details. For such tasks, UiPath provides the capability of screen scraping, which helps the user find the best data output method to solve a business problem. The three screen scraping methods available within UiPath are Full Text, Native, and OCR.

Full Text. This is the default output method for screen scraping. It is highly reliable and is the most frequently used. The speed of operation is very fast, and the job is done with complete accuracy. We can also include hidden data from the selected UI elements and extract it. Another positive feature of the Full Text output method is that it runs in the background, which means the application does not need to be active.

Native. The Native method gives the user the capability to extract metadata for the text available on the screen, such as the screen coordinates of each word or character. Unlike the Full Text output method, this process doesn't work in the background, and the speed of operation is slower, but in general it is still considered to be pretty fast.
The operations are performed with complete accuracy, similar to the Full Text method.

Now comes the beast: OCR, also discussed in Citrix automation. OCR extracts data from an image. It is not fully reliable, as the text in the image may get misinterpreted by the workflow. Although the technology has evolved significantly in this domain, it is usually chosen as the last resort, if the other two output methods don't work. It is often used to automate processes where complete accuracy is not of paramount importance. The two OCR engines that come pre-installed are Microsoft OCR and Google OCR. Microsoft OCR is suitable for scraping data from large images that contain a huge amount of text, like scanned files, invoices, etc., while Google OCR is mostly used to scrape data from small, low-quality images, like UI elements, and small images that have very little data in them.

A few distinguishing features of Google OCR include the Invert option and the Scale parameter. The Invert option, once checked, inverts a white-on-black image, since black text on a white background can be read more efficiently by computer programs. The Scale option is used to increase image resolution by enlarging the image and filling the gaps with additional pixels.

Let's take a sample use case to understand all the theory that we have just gone through. Consider that we have an application like this open here, and we want to extract the data out of it. So we go to Screen Scraping and we select the UI element, which is the whole window.
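Going back to the Invert and Scale options for a moment, here is a rough sketch in Python of what those two preprocessing steps mean on a tiny grayscale image. This is an illustration only: the image is a plain list of rows of 0 to 255 values, and the nearest-neighbour enlargement is an assumption, since how the OCR engine actually fills in the additional pixels is not stated here.

```python
def invert(image):
    """Invert an 8-bit grayscale image given as a list of rows of 0-255 values."""
    return [[255 - px for px in row] for row in image]


def scale_nearest(image, factor):
    """Enlarge an image by an integer factor by repeating each pixel."""
    scaled = []
    for row in image:
        # Repeat every pixel horizontally, then repeat the row vertically.
        wide_row = [px for px in row for _ in range(factor)]
        scaled.extend(list(wide_row) for _ in range(factor))
    return scaled


# 0 is black, 255 is white: a tiny white-on-black sample.
white_on_black = [
    [0, 255, 0],
    [255, 255, 255],
]

black_on_white = invert(white_on_black)
enlarged = scale_nearest(black_on_white, 2)

print(black_on_white)  # [[255, 0, 255], [0, 0, 0]]
print(len(enlarged), len(enlarged[0]))  # 4 6
```

Inverting flips light and dark, so white text on a black background becomes black text on a white background, and scaling by 2 turns the 2 by 3 image into a 4 by 6 image.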
Once we select the UI element, we get the screen scraping wizard with the scraping method, or output method, that UiPath thinks would be the best. It took Full Text, which is the most reliable one, and it is showing all the data that we can see here, even the maximize, minimize, and close buttons. If there were a drop-down menu, the Full Text method would have shown all the values of that drop-down menu as well. You can take out all that extra, trivial information by checking Ignore Hidden and refreshing, and you see that only the text that is visible here is shown. We can then generate a table by finishing the wizard. And if we need to get a particular piece of data, we can access that particular row, because as long as the overall structure of the application remains the same, this method is always going to work.

Let's check out the other method, Native. If we refresh, that's how it looks: it is even taking the spaces into consideration. And if you need the placement or the coordinates, you can check the words info option to see that as well. It shows the screen coordinates, to make it visible where a particular component is located within the window.

And for OCR, let's take Google OCR and see if that works or not. Let's go with the default options, and let's take out the words info. With Google OCR, we see most of it is correct. It's able to get the option New Order; in the next one, instead of the V, it's reading a backslash; and then Quit and Option are shown. So in cases where it was an image, or the application was running in a virtual environment, that would of course have been a great option. And we can always play around: we can increase the scale and check whether that works better or worse. In our case, you see, instead of the V it's now reading a W. Still, it looks much better, because along with each option we got the option number as well. We can also try Invert: internally, it changes the colors, so the black background turns white and the white text turns black. Quite often, the data is then read more accurately, as you can see in this case as well: it suddenly became View. Now all the labels are read correctly by the Google OCR engine.

We can also try Microsoft OCR, just to see how it works and whether it gives a better result. You see, there are a few stray commas, and some of the text is misread, for instance as UT. So you have to make those choices and choose the OCR engine. Let me give a quick example, because OCR is always the toughest one.
I'm going to refresh this. It seems like an okay result. Let's say my business case is that I want to know which three options are provided to me. All three of them are displayed correctly, so I finish the wizard, and here is the sequence that is generated. Once I open it, you can see that the window has been chosen: there is the Attach Window activity, and inside it, it's getting the OCR text. Get OCR Text is the activity that has been used to get the text using the OCR engine. Once we use Get OCR Text, we have to tell it which OCR engine to use, and the engine is also provided as a separate activity: we have to specify whether it's Google or Microsoft, which OCR engine we are going to use for the scraping action. And if you look at the Get OCR Text activity, we are getting the text in a variable named after the window, training order system exe. That's the text, and if you want to check what it looks like, you can simply output that variable. If we run this, we can see the string. That means the OCR engine is, to a pretty decent extent, able to scrape the data: it's treating the window as an image and giving us all the data from it.

So that's how the OCR engines work, and you do need to specify the engine. I also want to emphasize that if, instead of the OCR scraping method, we had taken, say, the Full Text output method, then the activity generated would have been Get Full Text. It would have attached to the window, selected the whole window itself to scrape the data out of it, and then the activity would have been Get Full Text. And once the data has been scraped, if you want to check it, you can use the Write Line or Message Box activity. Similarly, for the Native method, we get Get Visible Text. We can check how the string has been formed, or, in some cases, like with OCR, whether a table has been generated or not. But that's how the data is scraped. And as you have seen in basic recording, usually the Get Text activity is used.

Web scraping, or data scraping. Web scraping works differently than the other scraping output methods discussed before.
The Full Text, Native, and OCR methods are used to extract free-form data, while the web scraping technique acts on structured data. Structured data refers to information put together in an organized fashion. It follows a certain pattern that makes it easier to store, operate upon, and retrieve whenever required. The best example of structured data would be a table with header information providing context to the stored information in terms of its labels and associated fields.

For the use case, let's say we want to extract the first hundred Google search results for something. Let's say we have Internet Explorer open with Google, and we searched for chocolate pudding. Here, we want to get certain data out of the page: we want to extract the first hundred search results, and for each one we want the title of the result and its link, if available, which it seems it is. Because it's structured, all the search results follow pretty much the same pattern: a title, a link, and then some data about it. So we'll use data scraping. By the way, in the Community Edition of UiPath, web scraping is named Data Scraping. It is basically the same icon at the same place, with exactly the same functionality.

Now, we want to scrape the data, and the wizard asks us to select the first element. So we choose Next and select, let's say, the title of the first search result. Then we select the second element, again to identify the pattern, so that the subsequent scraping can be done. The second selection would be this element. Let's name this column Title. It makes sense. Next, and we are getting the data.

We want to have, let's say, a hundred results, or to make it easier, we will make it 50 results. And we also want the link, whatever is displayed on the page; there might be some trailing dots in it, but that's how it is. So we choose to extract correlated data, and we want the link here. Let's select the second element to help UiPath recognize the pattern, and we have selected the URL as well. Next, and that's how it looks. Now we want to finish. But this time it asks whether the data spans across multiple pages. Yes, it does, because each page has something like eight or ten results and we want to scrape the first 50 results. So we indicate Yes.
And we select the Next link. Now the sequence has been generated automatically, and we set it as the start node. Once we get into it, it's going to attach to the browser, where the search for chocolate pudding has already been done, and it's going to extract the structured data out of it. Among the parameters, you can see that the maximum number of results is 50. We also have the Next link selector, which was automatically generated by UiPath, and a selector for the main element, the title element that was chosen. And here's the timeout property. If by any chance the element is not found, and we don't want it to wait for more than, say, 10 seconds, then we can put that parameter here in milliseconds. So it's going to be 10,000, and that way it will wait for 10 seconds. If the element still isn't found, it's going to raise an error. The result is the data table shown here, ExtractDataTable, which has been created automatically by UiPath.

To read it, and we'll be discussing data tables in detail later, For Each Row is what we use; we did touch upon it a little bit. So: for each row in ExtractDataTable. That's how we traverse the extracted data table row by row. To get a data item, because we have two different columns, we have another activity called Get Row Item. What it does is get a particular piece of data from that particular row. We can identify the column in any of three ways, for example by the name of the column or by its index; I always use the index because that makes it more robust and reliable. So index zero is going to give me the title of the link. I create a new variable here and name it title; it's created automatically. Its type is GenericValue; I could have made it text as well, but this will get the job done this time.

Then I add another Get Row Item and select the row, because that's what is being traversed: during the first iteration it's the first row, during the second iteration it's the second row. From that row, I want to access column number one, which is going to be the URL. So again I set the variable and name it url. Now I want to print the whole table, all the elements in the table. So I'm going to add a Write Line activity as well. You can see title, then plus, a space to concatenate in between, then plus, and then url. So while it's traversing each and every row, it's going to print the items within that row as well. That's the way you display the items within a data table. So we've got the data extraction set up, and we have the chocolate pudding page open as well. Let's try and run it.
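Before the run, here is the same loop written as a small Python sketch, just to show the logic. It is not UiPath code: the extract_data_table list and its sample rows are made up, and get_row_item is a hypothetical helper that plays the role of Get Row Item addressed by column index.

```python
# Made-up stand-in for the extracted data table: each row is a list where
# index 0 holds the title and index 1 holds the URL.
extract_data_table = [
    ["Chocolate pudding recipe", "https://example.com/recipe"],
    ["Easy chocolate pudding", "https://example.com/easy"],
]


def get_row_item(row, column_index):
    # Plays the role of the Get Row Item activity, addressed by column index.
    return row[column_index]


lines = []
for row in extract_data_table:      # For Each Row
    title = get_row_item(row, 0)    # column 0: title
    url = get_row_item(row, 1)      # column 1: URL
    line = title + " " + url        # same concatenation as in Write Line
    lines.append(line)
    print(line)
```

Using the index keeps the loop independent of whatever the columns were named in the wizard, which is the robustness argument made above.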
The data is scraped automatically, and it goes to the second page. The data is scraped, and it goes to the third page. Everything is being done automatically: it's automatically clicking the Next link. Page four, page five, and page six; once it reaches the maximum of 50 results, it has scraped the data and it's done. So in six pages, we got the first 50 results of the chocolate pudding Google search. If you want to check what the output looks like, there it is. There's the title, there's the link. Here's the title, here's the link, and so on.

So that's how data scraping is done. Web scraping is actually a very important feature. Whenever we have structured data, we can always get the data out of it into a table that we can compute on further or simply display. It's a very easy, reliable, and very fast method to extract data out of a web application.
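The page-by-page behavior we just watched can be sketched in a few lines of Python. This is only a model under stated assumptions: scrape_paginated is a hypothetical function, the pages are made-up lists with nine results each, and moving to the next list stands in for clicking the Next link.

```python
def scrape_paginated(pages, max_results):
    """Collect rows page by page until max_results is reached or pages run out.

    `pages` is a list of pages, each page being a list of rows. Returns a
    tuple (rows, pages_visited).
    """
    if max_results <= 0:
        return [], 0
    rows = []
    pages_visited = 0
    for page in pages:              # each iteration models one Next click
        pages_visited += 1
        for row in page:
            rows.append(row)
            if len(rows) >= max_results:
                return rows, pages_visited
    return rows, pages_visited


# Made-up data: 10 pages with 9 results each.
fake_pages = [[f"result {p}-{i}" for i in range(9)] for p in range(1, 11)]

rows, visited = scrape_paginated(fake_pages, max_results=50)
print(len(rows), visited)  # 50 6
```

With nine results per page, five pages give 45 results, so the sixth page is where the limit of 50 is reached and the scraping stops, which matches the kind of run shown above.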
And there are multiple use cases. Usually, once we get a data table, we don't even have to traverse it within the program; we simply write the whole data table into an Excel spreadsheet or a CSV file. Then it could either be used as input to another enterprise application or be sent out to different people as a generated, dynamic report that has just been pulled out of the web application. So you see, the possibilities are tremendous. Endless.
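As an illustration of the write-it-all-out idea, here is a minimal Python sketch using the standard csv module. The data_table_to_csv function, the column headers, and the sample rows are made up for the example; in UiPath itself you would use the corresponding activities rather than code like this.

```python
import csv
import io


def data_table_to_csv(headers, rows):
    """Serialize a header row plus data rows into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = data_table_to_csv(
    ["Title", "URL"],
    [
        ["Chocolate pudding recipe", "https://example.com/recipe"],
        ["Pudding, the easy way", "https://example.com/easy"],
    ],
)
print(csv_text)
```

Note that the second title contains a comma, so the writer puts it in quotes; that is the kind of detail a CSV writer handles for you. To produce a file instead of a string, the same writer can be pointed at a file opened with open(path, "w", newline="").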
We can change the number of entries to be extracted by going to the properties of the Extract Structured Data activity and modifying the maximum number of results attribute, which is right here, to get, say, the first hundred results or the first thousand results.

There is another quick trick that can save a lot of your time and effort as well. Suppose the data available on a web page is very well structured, that is, in the form of a table with rows and columns. Then this quick trick helps to extract the whole table in one single click. For example, let's say we are looking at the historical stock data for Apple, and it's in the form of a table. In UiPath Studio, we go to the Data Scraping tool, and at the select element step we click Next. Instead of selecting the first element as before, we select any data element anywhere in the table. We get a note saying that we selected a table and asking whether we would like to extract the data from the whole table. If you want that, you select Yes, and you see the whole table has been extracted very nicely. You can simply finish, and the extracted data table is available to you to play around with, to operate upon, or to simply display.
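To give a feel for why a well-formed table can be pulled out in one go, here is a small Python sketch that walks an HTML table with the standard html.parser module. It is not how UiPath does it internally; the TableExtractor class is written just for this example, and the sample table with its dates and prices is entirely made up.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of every table row in an HTML document."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up sample table; the values are placeholders, not real stock data.
sample_html = """
<table>
  <tr><th>Date</th><th>Close</th></tr>
  <tr><td>Day 1</td><td>100.0</td></tr>
  <tr><td>Day 2</td><td>101.5</td></tr>
</table>
"""

parser = TableExtractor()
parser.feed(sample_html)
print(parser.rows)
```

Because the rows and cells are already marked up, there is no pattern to teach by picking a first and a second element; the structure of the table itself says where every value is.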
So pretty much the same steps were covered in web recording, but instead of using a recorder to generate activities for us, here we have created the whole working solution from scratch. We are getting there. We are actually getting better at this, and the chances of it working with no errors and being optimal are significantly higher than for a recorded workflow accomplishing the same task.