Text Parsing
Parse any text or document with a few lines of code.
Text Parsing is a general-purpose parsing engine. It extracts meaningful semantic entities from text input, which you can use to define, design, and customize your own parsed document object.
The most common use case is standardizing and enriching the job offers in your databases. Extracting semantic entities such as companies, locations, tasks, and skills from an unstructured job offer also makes it easy to build efficient dashboards and reports.
Prerequisites
API Endpoint
Get more information about the endpoint 🧠 Parse a raw Text.
Step 1: Configure your Postman Environment
Following the steps from the HrFlow.ai Postman publication will make you land on this page:
First, click on the "Environments" tab on the left side of your Postman window. Then, fill in the Empty - Environment template with the correct values. The compulsory variables for Text Parsing are:
- x-api-key: follow the steps from 🔑 API Authentication to retrieve it
- x-user-email: follow the steps from 🔑 API Authentication to retrieve it
Finally, save the environment and ensure that you selected Empty - Environment as your current environment.
Step 2: Get your First Text Parsing Results
Fill in your body parameters in raw format. The body contains a single key, text, whose value is the text you want to parse.
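As a minimal sketch, the raw body described above could be built and serialized in Python as follows. The helper name build_body is hypothetical and used only for illustration; the single text key is the only part taken from this guide.

```python
import json


def build_body(text: str) -> str:
    """Serialize the request body; `build_body` is a hypothetical helper name."""
    # The body holds exactly one key, "text", with the text to parse.
    return json.dumps({"text": text})


body = build_body("Assistant Professor\nStanford University.")
print(body)
```

The resulting string can be pasted into the raw body editor in Postman.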
The parsing result is returned in the data field, which contains:
- text: the text sent to the API
- ents: the list of Parsing entities detected by our API
- parsing: the lists of entities grouped by their type
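The three fields can be read from a decoded response along these lines. This is only a sketch: unpack_data is a hypothetical helper, the sample dictionary is a shortened illustration of the response shape, and treating any code other than 200 as an error is an assumption rather than documented API behavior.

```python
def unpack_data(response_json: dict) -> tuple:
    """Return (text, ents, parsing) from a decoded Text Parsing response.

    `unpack_data` is a hypothetical helper; raising on any code other than
    200 is an assumption, not documented API behavior.
    """
    if response_json.get("code") != 200:
        raise ValueError(f"Unexpected response: {response_json.get('message')}")
    data = response_json["data"]
    return data["text"], data["ents"], data["parsing"]


# Shortened sample with the same shape as a real response.
sample_response = {
    "code": 200,
    "message": "Text parsing results",
    "data": {
        "text": "Assistant Professor",
        "ents": [{"start": 0, "end": 19, "label": "job_title"}],
        "parsing": {"job_titles": ["Assistant Professor"], "companies": []},
    },
}

text, ents, parsing = unpack_data(sample_response)
print(len(ents), parsing["job_titles"])
```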
Data Fields
- The parsing field lists all the entities extracted from your text.
- The ents field targets more advanced applications that require the position of entities within your text.
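For simple use cases, the parsing field can be consumed directly. The sketch below keeps only the categories in which something was detected; non_empty_categories is a hypothetical helper name, and the sample dictionary is a shortened illustration of the field.

```python
def non_empty_categories(parsing: dict) -> dict:
    """Keep only the categories that contain at least one entity.

    `non_empty_categories` is a hypothetical helper used for illustration.
    """
    return {category: values for category, values in parsing.items() if values}


# Shortened sample of the "parsing" field.
sample_parsing = {
    "companies": ["Stanford University"],
    "job_titles": ["Assistant Professor"],
    "skills_hard": ["Machine learning"],
    "skills_soft": [],
    "tasks": [],
}

summary = non_empty_categories(sample_parsing)
print(summary)
```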
Each Parsing entity in the ents field consists of three pieces of information:
- start: the beginning of the entity in the text
- end: the end of the entity in the text (exclusive)
- label: the type of entity (e.g. job_title, company, location, etc.)
For example, given the following Response:
{
  "code": 200,
  "message": "Text parsing results",
  "data": {
    "ents": [
      {
        "end": 19,
        "label": "job_title",
        "start": 0
      },
      {
        "end": 118,
        "label": "company",
        "start": 99
      },
      {
        "end": 171,
        "label": "location",
        "start": 120
      },
      {
        "end": 190,
        "label": "phone",
        "start": 177
      },
      {
        "end": 209,
        "label": "phone",
        "start": 196
      },
      {
        "end": 236,
        "label": "email",
        "start": 217
      },
      {
        "end": 273,
        "label": "skill_hard",
        "start": 257
      }
    ],
    "parsing": {
      "certifications": [],
      "companies": [
        "Stanford University"
      ],
      "courses": [],
      "dates": [],
      "durations": [],
      "education_titles": [],
      "emails": [
        "[email protected]"
      ],
      "first_names": [],
      "interests": [],
      "job_titles": [
        "Assistant Professor"
      ],
      "languages": [],
      "last_names": [],
      "locations": [
        "Room 156, Gates Building 1A Stanford, CA 94305-9010"
      ],
      "phones": [
        "(650)725-2593",
        "(650)725-1449"
      ],
      "schools": [],
      "skills_hard": [
        "Machine learning"
      ],
      "skills_soft": [],
      "tasks": []
    },
    "text": "Assistant Professor\nComputer Science Department Department of Electrical Engineering (by courtesy)\nStanford University.\nRoom 156, Gates Building 1A Stanford, CA 94305-9010\nTel: (650)725-2593\nFAX: (650)725-1449\nemail: [email protected]\nResearch interests: Machine learning, broad competence artificial intelligence, reinforcement learning and robotic control, algorithms for text and web data processing."
  }
}
The request body that produced this response is:
{
  "text": "Assistant Professor\nComputer Science Department Department of Electrical Engineering (by courtesy)\nStanford University.\nRoom 156, Gates Building 1A Stanford, CA 94305-9010\nTel: (650)725-2593\nFAX: (650)725-1449\nemail: [email protected]\nResearch interests: Machine learning, broad competence artificial intelligence, reinforcement learning and robotic control, algorithms for text and web data processing."
}
The first entity is a job_title that starts at character 0 and ends at character 19 (exclusive) of the following text:
Assistant Professor
Computer Science Department Department of Electrical Engineering (by courtesy)
Stanford University.
Room 156, Gates Building 1A Stanford, CA 94305-9010
Tel: (650)725-2593
FAX: (650)725-1449
email: [email protected]
Research interests: Machine learning, broad competence artificial intelligence, reinforcement learning and robotic control, algorithms for text and web data processing.
Thus, the first parsed element is the job_title Assistant Professor from the text.
You can then build a structured object by iterating through all the ents returned by the Text Parsing API.
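The iteration could look like the sketch below, which slices the text with each entity's start and end offsets and groups the results by label. The function name group_entities is hypothetical, and the sample uses only the first lines of the example text with the first three entities of the example response.

```python
from collections import defaultdict


def group_entities(text: str, ents: list) -> dict:
    """Group entity strings by label; `group_entities` is a hypothetical name."""
    grouped = defaultdict(list)
    for ent in ents:
        # "end" is exclusive, so a plain Python slice recovers the entity.
        grouped[ent["label"]].append(text[ent["start"]:ent["end"]])
    return dict(grouped)


# First four lines of the example text and their entities.
sample_text = (
    "Assistant Professor\n"
    "Computer Science Department Department of Electrical Engineering (by courtesy)\n"
    "Stanford University.\n"
    "Room 156, Gates Building 1A Stanford, CA 94305-9010"
)
sample_ents = [
    {"start": 0, "end": 19, "label": "job_title"},
    {"start": 99, "end": 118, "label": "company"},
    {"start": 120, "end": 171, "label": "location"},
]

structured = group_entities(sample_text, sample_ents)
print(structured)
```

Grouping by label keeps repeated entity types, such as the two phone numbers in the full example, together in one list.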
Advanced Topics
1. Try Text Parsing in your Favorite Programming Language
Postman can generate the equivalent request code in your favorite programming language. Here is an example in Python.
import requests

url = "https://api.hrflow.ai/v1/text/parsing"

# The request body has a single key, "text", holding the text to parse.
payload = {
    "text": "Assistant Professor\nComputer Science Department Department of Electrical Engineering (by courtesy)\nStanford University.\nRoom 156, Gates Building 1A Stanford, CA 94305-9010\nTel: (650)725-2593\nFAX: (650)725-1449\nemail: [email protected]\nResearch interests: Machine learning, broad competence artificial intelligence, reinforcement learning and robotic control, algorithms for text and web data processing."
}

# Replace the placeholders with your own credentials.
headers = {
    "X-USER-EMAIL": "YOUR_USER_EMAIL",
    "X-API-KEY": "YOUR_SECRET_KEY",
}

# json= serializes the payload and sets Content-Type to application/json.
response = requests.post(url, headers=headers, json=payload)
print(response.text)