Extracting MORE Features from Map Services

Back in August 2015 I posted information about how you can extract features from a map service.  Since then, I have had many contact me about modifying the code so it can extract features beyond the record limit set in the map service.  So today I decided to work on one that does!

To test the script, I headed over to the map services provided by the State of California GIS.  Specifically, the one for wildfires:

http://services.gis.ca.gov/arcgis/rest/services/Environment/Wildfires/MapServer

When you scroll down a quick look will reveal that there is a maximum record count of 1000:

extractpt2-1

At the top, click on the “ArcGIS Online map viewer” link to take a look at the data.  I changed my basemap to Dark Gray Canvas so the symbology really pops.  Here is the area around Santa Clarita:

extractpt2-2

What a mess!  There sure has been a lot of fires in this area.  Clicking on the Show Table icon for the Wildfires layer will reveal how many records (wildfire polygons) I will need to extract:

extractpt2-3

I will extract all 17,550 polygons with the following Python script:

import arcpy
import urllib2
import json

# Setup
arcpy.env.overwriteOutput = True
baseURL = "http://services.gis.ca.gov/arcgis/rest/services/Environment/Wildfires/MapServer/0"
fields = "*"
outdata = "H:/cal_data/data.gdb/testdata"

# Get record extract limit
urlstring = baseURL + "?f=json"
j = urllib2.urlopen(urlstring)
js = json.load(j)
maxrc = int(js["maxRecordCount"])
print "Record extract limit: %s" % maxrc

# Get object ids of features
where = "1=1"
urlstring = baseURL + "/query?where={}&returnIdsOnly=true&f=json".format(where)
j = urllib2.urlopen(urlstring)
js = json.load(j)
idfield = js["objectIdFieldName"]
idlist = js["objectIds"]
idlist.sort()
numrec = len(idlist)
print "Number of target records: %s" % numrec

# Gather features
print "Gathering records..."
fs = dict()
for i in range(0, numrec, maxrc):
  torec = i + (maxrc - 1)
  if torec > numrec:
    torec = numrec - 1
  fromid = idlist[i]
  toid = idlist[torec]
  where = "{} >= {} and {} <= {}".format(idfield, fromid, idfield, toid)
  print "  {}".format(where)
  urlstring = baseURL + "/query?where={}&returnGeometry=true&outFields={}&f=json".format(where,fields)
  fs[i] = arcpy.FeatureSet()
  fs[i].load(urlstring)

# Save features
print "Saving features..."
fslist = []
for key,value in fs.items():
  fslist.append(value)
arcpy.Merge_management(fslist, outdata)
print "Done!"

I put in a few comments so you could see how things are done where.  This script also gives you some feedback as it runs.  I will explain the script.

First, it loads the arcpy module as well as the urllib2 and json modules (we need those to open the url and read in the json output).  In the Setup section, I set my baseURL variable to the wildfire map service layer 0 (you would change this line for the map service layer you are interested in), the fields variable to “*” to grab all the attributes, and the outdata variable to where I want to store my features, in this case in a file geodatabase and a feature class name of testdata (change this line to where you want to save your data, you can also specify a shapefile name instead).

Next, the script extracts the record limit for the map service and saves it to the maxrc variable.  Next, the script grabs the id field name (which in this case is OBJECTID) and all the record id numbers are stored in the idlist variable.  The id list is sorted and then the total number of records value is stored in the numrec variable.

Next, all the feature records are gathered up.  This is done by taking the total number of records and stepping through them in chunks defined by the record limit.  Since this map service has a limit of 1000, the first 1000 records are selected by the id field (OBJECTID) and saved to a temporary featureset fs[i].  It then cycles to the next 1000 records and stores those features, etc., until it reaches that last set.  Note the if statement “if torec > numrec” is a fail safe when we reach the end and our to record number we calculate is higher than the maximum number of records, and if so, the torec value is set to the last record number in the list.  After that, the loop is finished.

Finally, we loop through all the temporary featuresets to put them in a list and then pass that list to merge them all together to create our outdata.  Whew!

Here is what my output looked like when I ran the Python script:

extractpt2-4

Note that as it gathers the features it shows you the queries it uses to do it, in 1000 record chunks.  Also note that sometimes there are missing record numbers, probably because those records with certain OBJECTIDs were either edited and changed numbers or were deleted.  So that is why you see in the first grab OBJECTIDs from 1 to 1003.  It is not grabbing 1003 records, it is just that the first 1000 record numbers in the list start with OBJECTID 1 and end in OBJECTID 1003.  Get it?

The script took a little over 3 minutes to run.  Taking a look using ArcCatalog shows the gathered features:

extractpt2-5

And looking at the table shows I grabbed all 17,550 records and the attributes:

extractpt2-6

I hope this updated script will help you out with those map services that have record limits.  Enjoy!  -mike

29 thoughts on “Extracting MORE Features from Map Services

  1. This is good stuff Mike. How does the python script work with an app? In what way is it incorporated? Do you have any samples?

    Thank you
    Brandon Price

    • Hi Brandon. This python script does use the arcpy module, so it was written for use with ArcGIS. I have a Windows 7 machine, so I just run it at the DOS prompt. You could also just double click the script and it will run too. You could modify the script to not use arcpy at all, you would just need to load other modules that would allow you to save to a shapefile. I’m not sure if this script is app worthy, but I guess you could create one that would allow the user to input a map service URL and output format. I know you could turn this into a geoprocessing tool in ArcGIS with a few modifications. -mike

      • Hey Mike:

        Thanks for the clarification. This is still a big benefit for me for ArcGIS Desktop. I was just wondering because I have an app that I might want to add a download widget to later on down the line.

        Brandon

  2. Thank you so much for sharing this! We are moving all of our services to AGOL, but have needs for local geoprocessing. This is just what I needed. I had to do some tweaks to get it to run using ArcGIS Pro’s Python 3 installation. I’d be happy to share those modifications if you’d like.

  3. Hi Micheal,
    Thank you so much for all your efforts on this code, super cool! Unfortunately I am running into an issue right at the end of the script, I receive the following error: “Runtime error Traceback (most recent call last): File “”, line 49, in File “c:\program files (x86)\arcgis\desktop10.3\arcpy\arcpy\management.py”, line 4217, in Merge raise e ExecuteError: ERROR 999998: Unexpected Error.”

    Line 49 in the code is “arcpy.Merge_management(fslist, outdata)”
    And line 4217 in my management.py file is “raise e” (I can provide the previous lines if that’s helpful to you).
    I know it’s likely impossible to troubleshoot all errors one could expect but I’m not sure if something jumps out at you. I’m not python expert so I appreciate any help you can offer.
    I should also note, I copied and pasted your code exactly as you have it, within the python window in ArcGIS 10.3; only changing the path to the map service and where I desired to save my output data.
    Thank you again!
    ~D

    • Hi Danielle. Either there is an anomaly in the map service data, or maybe the name you use for the outdata variable is not valid by ArcGIS standards. For example, does you data name have spaces in it? Sometimes that is a problem. Is it going to a .SHP file? If so, is it hitting some limits with the number of features or fields? Send me the map service path you are using and I can give it a try as long as I have access to it. -mike

      • Another thing that comes to mind is your computer. Maybe the environment on your computer is causing a problem. Does it have enough memory for all those temporary feature datasets before the merge? Is your hard drive fast enough? Have you tried using another computer to see if the code works? Have you tried upgrading ArcGIS Desktop to see if the problem goes away? I am using ArcGIS Desktop 10.5.1. Just some thoughts. -mike

      • Mike,
        Thank you much! This is the service I’m using: https://gis.nassaucountyny.gov/arcgis/rest/services/ncgis/MapServer/13

        I’m fairly certain it’s not locked down as I’m able to run a single query and download batches of a 1000 but there any many records (407,888) so doing this action 400+ times is prohibitive and makes your script all the more appealing 🙂

        I’ve been sure to exclude any spaces in my naming conventions. I’ve attempted another, smaller map service as well but with the same result. Additionally I also attempted using your map service and simply changed the output path (again no spaces) to a geodatabase of the same name; still no luck there.

        I also used another PC, as you suggest below, same result, but it’s also on the same computer network with the same ArcGIS build specs.
        As far as my PC; I have a 64-bit OS with 8GB of RAM (7.88GB usable) and a 3.40GHz processor.

        ~D

      • Ok, so I think I know the problem … too much data! With 407,888 records, that means you will need enough memory to store it all, then basically merge all 408 temporary feature datasets into one. I don’t think you have enough memory OR merge cannot handle it all. I am going to try on my end with a hefty computer and see what happens to me!

      • Well it worked for me, go figure! I had 12gb of memory, but noticed it only consumed about half a gig when the script was almost done. I am using ArcGIS 10.5.1, so maybe that helped me? I saved the parcels to a geodatabase and it is only 88mb in size. Anyway, if you want the data I downloaded, feel free to email me at mcarson@burbankca.gov and I will send you a link.

      • Well imagine that! I appreciate you giving it a whirl! I definitely think it’s the Desktop versioning since I’ve tried map services with far fewer records and had the same result.
        I’ll download the latest version and report back just in case others have the same issue.
        Thank you for all of your efforts Mike!

  4. I successfully updated my ArcGIS Desktop Version to 10.6 and still no luck with running the script. Wanted to let folks know if they run into the same error I posted above.

    • Hi Danielle. I was thinking about your issue with the script. I still think it is a memory issue for you. Do you run the script inside ArcMap using the Python window? If so, try running the python script at the Windows DOS prompt instead (Windows cmd window).

      Another option is the script below. It will save each feature extract instead of using memory, then it merges all the features together. Just modify the outdata variable and see if that works. Do note that it will leave a bunch of temporary feature classes in the geodatabase, each representing the 1000 feature extract. Give it a try.

      import arcpy
      import urllib2
      import json

      # Setup
      arcpy.env.overwriteOutput = True
      baseURL = “https://gis.nassaucountyny.gov/arcgis/rest/services/ncgis/MapServer/13”
      fields = “*”
      outdata = “C:/temp/data.gdb/ncgis_parcels”

      # Get record extract limit
      urlstring = baseURL + “?f=json”
      j = urllib2.urlopen(urlstring)
      js = json.load(j)
      maxrc = int(js[“maxRecordCount”])
      print “Record extract limit: %s” % maxrc

      # Get object ids of features
      where = “1=1”
      urlstring = baseURL + “/query?where={}&returnIdsOnly=true&f=json”.format(where)
      j = urllib2.urlopen(urlstring)
      js = json.load(j)
      idfield = js[“objectIdFieldName”]
      idlist = js[“objectIds”]
      idlist.sort()
      numrec = len(idlist)
      print “Number of target records: %s” % numrec

      # Gather features
      print “Gathering records…”
      merge_list = []
      for i in range(0, numrec, maxrc):
        torec = i + (maxrc – 1)
        if torec > numrec:
          torec = numrec – 1
        fromid = idlist[i]
        toid = idlist[torec]
        where = “{} >= {} and {} <= {}".format(idfield, fromid, idfield, toid)
        print " {}".format(where)
        urlstring = baseURL + "/query?where={}&returnGeometry=true&outFields={}&f=json".format(where,fields)
        fs = arcpy.FeatureSet()
        fs.load(urlstring)
        tempdata = outdata + str(i)
        print " Copying features…"
        arcpy.CopyFeatures_management(fs, tempdata)
        merge_list.append(tempdata)

      # Save features
      print "Saving features…"
      arcpy.Merge_management(merge_list, outdata)
      print "Done!"

  5. We were having the same issue. Have you tried turning off the background processing in Geoprocessing Tab>Geoprocessing Options and then uncheck enable background processing.

    Oh and thanks Mike that script helped a ton!

    • Thanks Bryan. I have background geoprocessing on and it works for me, I tested it. However, I do not run the script inside ArcMap using the Python window. I run it at the Windows DOS prompt. If Python is setup right in Windows, you can even double click the .py script and it will run too.

      • Mike, thanks for the additional info. I will definitely keep that in mind. My comment above was more in reference to possibly help Danielle out. I was not detailed enough in my previous response.

        I was getting the same error Danielle had running the script through the python window in ArcCatalog which was the “Runtime error Traceback (most recent call last): File “”, line 49…..: ERROR 999998: Unexpected Error.”” But as soon as I disabled background processing (Geoprocessing Tab>Geoprocessing Options and then uncheck enable background processing) the script you provided worked exactly how it is supposed to.

        Again, thank you Mike for sharing the script with us.

    • Hi Kegan. It is not too difficult, you just need to authenticate yourself with ArcGIS Server and get a token, then you use that token when you make HTTPS requests to secure map and feature services.

      For example, to manually do this if an ArcGIS Server is setup for tokens (see http://maps.scag.ca.gov/scaggis/rest/services) you will see in the upper right corner a “Get Token” link. If you click on that, you will be taken to the tokens page (https://maps.scag.ca.gov/scaggis/tokens/) to enter a username and password to get a token from the server. Note you can specify the Client (I use RequestIP to limit my token to my current IP address), Expiration (I usually keep it to an hour), and Format (I use JSON). Once you get the token, you have access to the secured services at that site.

      So how to do this in Python? First at the top of your code where you import the arcpy module, you need to import the requests module.

      import requests

      Then the code block to get the token looks like this:

      tokenurl = "https://YOUR_ARCGIS_SERVER/tokens/"
      username = "YOUR_USERNAME"
      password = "YOUR_PASSWORD"
      params = {'username':username, 'password':password, 'client':'requestip', 'expiration':60, 'f':'json'}
      tokenresponse = requests.post(tokenurl, data=params, verify=False)
      data = tokenresponse.json()
      token = str(data["token"])

      Note for the params I set the client to “requestip”, the expiration to 60 (in minutes, so 1 hour), and my output format to “json”.

      With the token now set in the token variable, you use that variable when you make requests to secure map and feature services. You do this by adding “&token=” to the calling URL with the token value after the equal sign. For example, using the code sample in this post, I would change the urlstring variable and add the token parameter:

      urlstring = baseURL + "?f=json&token={}".format(token)

      and

      urlstring = baseURL + "/query?where={}&returnIdsOnly=true&f=json&token={}".format(where,token)

      and also

      urlstring = baseURL + "/query?where={}&returnGeometry=true&outFields={}&f=json&token={}".format(where,fields,token)

      That’s it!

      • Hi Michael

        Thank you so much for the advice! I had to use Webapp means of authentification to generate a token (the requestIP generated token did not seem to work for some reason). I eventually got the process to work since my dataset had about 14 million verticies it kept giving me the following error:

        “RuntimeError: RecordSetObject: Cannot open table for Load”,

        Turns out i had to reduce my MaxRecordCount to increments of 500 and it worked!

        Keep up the good work
        Kegan

  6. I keep getting an error when I run this code in ArcGIS Pro:
    import arcpy
    import urllib2
    import json

    # Setup
    arcpy.env.overwriteOutput = True
    baseURL = “https://gis2.idaho.gov/arcgis/rest/services/DPR/Idaho_Trails_MapC/MapServer/4/query”
    fields = “NAME,OFFICE,HUNTING_UNIT”
    outdata = “Q:\Temp\Mark_Temp\Idaho_Project\Idaho Trails Map\Trailheads.gdb/Automobiles2”

    # Get record extract limit
    urlstring = baseURL + “?f=json”
    j = urllib2.urlopen(urlstring)
    js = json.load(j)
    maxrc = int(js[“maxRecordCount”])
    print “Record extract limit: %s” % maxrc

    # Get object ids of features
    where = “1=1”
    urlstring = baseURL + “/query?where={}&returnIdsOnly=true&f=json”.format(where)
    j = urllib2.urlopen(urlstring)
    js = json.load(j)
    idfield = js[“objectIdFieldName”]
    idlist = js[“objectIds”]
    idlist.sort()
    numrec = len(idlist)
    print “Number of target records: %s” % numrec

    # Gather features
    print “Gathering records…”
    fs = dict()
    for i in range(0, numrec, maxrc):
    torec = i + (maxrc – 1)
    if torec > numrec:
    torec = numrec – 1
    fromid = idlist[i]
    toid = idlist[torec]
    where = “{} >= {} and {} <= {}".format(idfield, fromid, idfield, toid)
    print " {}".format(where)
    urlstring = baseURL + "/query?where={}&returnGeometry=true&outFields={}&f=json".format(where,fields)
    fs[i] = arcpy.FeatureSet()
    fs[i].load(urlstring)

    # Save features
    print "Saving features…"
    fslist = []
    for key,value in fs.items():
    fslist.append(value)
    arcpy.Merge_management(fslist, outdata)
    print "Done!"

    I get a error message:
    File "”, line 16
    print “Record extract limit: %s” % maxrc
    ^
    SyntaxError: Missing parentheses in call to ‘print’. Did you mean print(“Record extract limit: %s” % maxrc)?

    I’ll change these out but will still give me a syntax error.

    When I run the same code in ArcMap 10.3.1 I recieve this error:
    Runtime error
    Traceback (most recent call last):
    File “”, line 15, in
    KeyError: ‘maxRecordCount’

    I’m not sure what I’m doing wrong, any advice?

    • Hi Mark. The code was made to run in Python that comes with ArcGIS Desktop (ArcMap), not ArcGIS Pro. ArcGIS Pro uses Python 3.6.x while ArcGIS Desktop uses 2.7.x, so the code would need to be modified to work in Python 3 and ArcGIS Pro. As for the error you are getting using ArcMap, I would have to give it a try and see what is going on. -mike

    • Mark, remove the “/query” from your baseURL variable. It is messing up the code when it tries to query the map service to get the max record count. If you want to run in ArcGIS Pro, you will need to modify all the print statements to the new print function print( ). -mike

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s