Recreating Python's Slice Syntax in JavaScript Using ES6 Proxies

By Evan Sangaline | June 28, 2018

I’ve noticed that JavaScript proxies seem to have been getting an increasing amount of attention recently. They were introduced by ECMAScript 2015 (ES6) several years ago, but they remain one of the less well-known features of the language. That’s a real shame because proxies are pretty awesome. They give you a level of flexibility that simply didn’t exist previously in JavaScript, and have allowed for projects like Remote Browser to become possible.

In a nutshell, proxies allow you to wrap objects in order to intercept and modify the behavior of certain actions–stuff like accessing properties and making function calls. That’s pretty abstract, so let’s take a look at a concrete example. Say that you were tired of writing array[array.length - n] all of the time, and that you wanted to be able to simply write array[-n] instead. JavaScript unfortunately doesn’t support this syntax natively. To implement it, you need to get creative.

You could have almost, sort of, kind of accomplished this using old-fashioned JavaScript. And by “old-fashioned,” I mean specifically ECMAScript 5, which was released in 2009 (the 5.1 revision followed in 2011). It added support for getters and setters, which allow you to define methods that control how specific properties are accessed. Using Object.defineProperty(), we can attach accessors to an object that (mostly) recreate our negative array indexing access.

function wrapArray(array) {
  var wrappedArray = {};
  for (var i = 0; i < array.length; i++) {
    (function(i) {
      // Normal array indexing: `array[0]`, `array[1]`, etc.
      Object.defineProperty(wrappedArray, i.toString(), {
        get: function() {
          return array[i];
        },
        set: function(value) {
          array[i] = value;
        },
      });
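      // Fancy negative indexing to count back from the end: `array[-1]`, `array[-2]`, etc.
      // Index `i` pairs with the key `-(i + 1)`, so the keys run from -1 to -array.length.
      Object.defineProperty(wrappedArray, '-' + (i + 1).toString(), {
        get: function() {
          return array[array.length - (i + 1)];
        },
        set: function(value) {
          array[array.length - (i + 1)] = value;
        },
      });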
    })(i);
  }
  return wrappedArray;
}

The wrappedArray object that’s returned by this function has explicitly defined getters and setters for each of the array’s regular indices, along with negative keys that count back from the end of the array. After wrapping an array, things pretty much behave as we would expect.

// Wrap an array of 5 elements.
var array = wrapArray([0, 1, 2, 3, 4]);

// Outputs: 1
console.log(array[1]);

// Outputs: 4
console.log(array[-1]);

// Outputs: 'three'
array[-2] = 'three';
console.log(array[3]);

What if we wanted to call wrappedArray.forEach() or wrappedArray.map() though? These properties don’t exist on the wrapped array object; we would need to explicitly loop through all of the properties on the underlying array and attach additional getters and setters for each of them. After we did this, there would also be wrappedArray.push() and wrappedArray.unshift() to deal with. These allow the underlying array to grow in size, but our wrappedArray object only provides accessors that cover the original array length. That means that our getters for these methods would need to return wrapped versions of the methods that also handle attaching additional indexing accessors after each mutation. And even if we did that, any out of bound array access would simply return undefined instead of raising an exception. This is getting complicated fast.

The fundamental shift from using setters and getters to using proxies with get() and set() handlers is that they can capture arbitrary property access. The vast majority of the complexity relating to our wrapArray() method stems from the fact that we need a getter and setter for every single property that we intend to use. Proxies eliminate that necessity, and they really open the flood gates in terms of what sort of APIs you can possibly create in JavaScript.
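To make that concrete, here's a minimal sketch of negative indexing built on a single get() trap. The negativeIndexable() helper is a name made up for this illustration, and it only handles reads. Because the trap sees every property name as it's accessed, nothing has to be redefined when the array grows.

```javascript
// Hypothetical helper (not part of the article's code): wrap an array so that
// negative keys count back from the end. Only reads are handled in this sketch.
const negativeIndexable = array => new Proxy(array, {
  get(target, name, receiver) {
    // Property names arrive as strings (or symbols), so match on the string form.
    if (typeof name === 'string' && /^-\d+$/.test(name)) {
      return target[target.length + Number(name)];
    }
    // Everything else (`length`, `push`, `forEach`, ...) is forwarded untouched.
    return Reflect.get(target, name, receiver);
  },
});

const letters = negativeIndexable(['a', 'b', 'c']);

// Outputs: c
console.log(letters[-1]);

// Growing the array requires no extra bookkeeping.
letters.push('d');

// Outputs: d
console.log(letters[-1]);
```

Compare that with the Object.defineProperty() approach, where each new index would need its own pair of accessors.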

They say that with all great power comes great responsibility, and proxies are no exception. Abusing the power of proxies can lead to unexpected–actually, you know what? Let’s skip the disclaimer. Proxies are fun, and abusing them can be fun too. Sometimes that’s all you need.

This article’s for anyone who has ever, out of genuine curiosity, asked a question on Stack Overflow about whether something is possible, only to have their question closed with a “What are you actually trying to accomplish?” We’re going to recreate Python’s awesome slice syntax for array access in JavaScript using proxies. In the process, we’ll learn about how slicing in Python works under the hood, and we’ll also learn a lot about JavaScript proxies.

As always, all of the relevant code from this article is available in the intoli-article-materials repository on GitHub. You can head over there to skip ahead and see where we’re going to end up, or you can just star the repo because you love our blog and you want to find out about upcoming articles before they’re published. We also polished the code up and published it as an npm package called slice. That package includes better handling of edge cases, a robust test suite, separate SliceArray and SliceString classes, and a bonus implementation of Python’s range() function.

How Slicing Works in Python

Alright, time to get down to business. Let’s start by investigating how slicing works in Python. This will help set the stage for mimicking the behavior in JavaScript, both in terms of the syntax and the implementation.

Coincidentally, we’ll begin by doing something in Python that’s pretty similar to what proxies do in JavaScript. We’ll define a simple SliceProbe class that has a single method: __getitem__(). Methods that are wrapped in double underscores like this are often called “magic” methods in Python. When you try to do something like add a class instance to another object, or call it like a function, Python will check for the corresponding magic method to figure out what should happen next. The __getitem__() magic method is the one that corresponds to square bracket access, like you would use for list or string indexing.

class SliceProbe:
    """Simple class that overrides `[]` access to return the key."""
    def __getitem__(self, key):
        return key

# Create an instance of the class to use for probing.
probe = SliceProbe()

Our implementation of __getitem__() here just returns whatever the key was. This will allow us to probe how Python behaves in order to see how slicing works. Let’s start with some basic integer indexing to see what that looks like.

# Outputs: 1
print(probe[1])

# Outputs: -2
print(probe[-2])

OK, that seems pretty close to what we might expect. We put these integers in square brackets, and they got passed as-is into the __getitem__() method. However, this is notably not what would happen in JavaScript. JavaScript coerces nearly all property names to strings when they’re accessed, even the integer indices used with arrays. We’ll revisit this later on when we’re building our proxy in JavaScript.
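For comparison, here's a rough JavaScript analogue of SliceProbe, sketched with a proxy whose get() trap simply returns the property name it was asked for. The probe variable is only for illustration. It shows the string coercion directly.

```javascript
// A proxy whose `get` trap returns whatever property name it receives.
const probe = new Proxy({}, {
  get: (target, name) => name,
});

// Outputs: string
console.log(typeof probe[1]);

// Outputs: true
console.log(probe[-2] === '-2');

// Outputs: true
console.log(probe[1] === probe['1']);
```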

Negative indexing is great and all, but Python’s slice syntax extends far beyond that. For starters, you can use a colon in combination with one or two indices in order to select a particular range of a sequence. The behavior of this is actually pretty much identical to Array.slice() in JavaScript with the exception that JavaScript doesn’t support assignment to slices. Here are a few quick examples for reference.

  • array[:stop] - The first stop elements of an array, equivalent to array.slice(undefined, stop) in JavaScript.
  • array[start:stop] - Gives you the stop - start elements from start up to, but not including, stop, equivalent to array.slice(start, stop).
  • array[start:] - Gives you the elements from start through the end, equivalent to array.slice(start, undefined).
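The JavaScript side of those equivalences can be checked quickly with the built-in slice() method. The letters array is just sample data for this sketch.

```javascript
const letters = ['a', 'b', 'c', 'd', 'e'];

// Python: letters[:2]
// Outputs: ['a', 'b']
console.log(letters.slice(undefined, 2));

// Python: letters[1:3]
// Outputs: ['b', 'c']
console.log(letters.slice(1, 3));

// Python: letters[3:]
// Outputs: ['d', 'e']
console.log(letters.slice(3, undefined));

// Negative bounds count back from the end in both languages.
// Python: letters[1:-2]
// Outputs: ['b', 'c']
console.log(letters.slice(1, -2));
```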

Let’s try a few of these out with our SliceProbe.

# Outputs: slice(None, 1, None)
print(probe[:1])

# Outputs: slice(1, None, None)
print(probe[1:])

# Outputs: slice(1, 2, None)
print(probe[1:2])

# Outputs: slice(1, -2, None)
print(probe[1:-2])

This is where things start getting interesting! The shorthand syntax for slicing gets translated into an instance of the built-in slice class when it’s used inside of square brackets. A slice object has three main properties that classes can use to implement slicing behavior in __getitem__() and __setitem__(): start, stop, and step. When we print out a slice object, it’s printed out as slice(start, stop, step). The start and stop ones should be fairly self-explanatory–and they’re what we populated above–but what about step?

The step parameter is part of what is called the extended slice syntax in Python, and the syntactic sugar for specifying it is to include a second colon followed by a number in the slice. Here are a few quick examples of how it can be populated using the shorthand.

# Outputs: slice(None, None, 2)
print(probe[::2])

# Outputs: slice(1, None, -4)
print(probe[1::-4])

# Outputs: slice(1, 2, 3)
print(probe[1:2:3])

What this step parameter actually does is allow you to access every Nth element in a sequence. For example, array[::2] gives you the elements with even indices and array[1::2] gives you the elements with odd ones. A negative value for step works the same way except that each successive index decreases by step instead of increasing by it. Specifying only the step can be used to reverse an array with array[::-1], or it can be combined with start and stop to construct far more complex slices.
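JavaScript has no built-in notion of a step, so the closest plain equivalents tend to lean on filter() and reverse(). Here's a small sketch of the three examples above, using a made-up numbers array.

```javascript
const numbers = [0, 1, 2, 3, 4, 5, 6];

// Python: numbers[::2]
// Outputs: [0, 2, 4, 6]
const evens = numbers.filter((value, index) => index % 2 === 0);
console.log(evens);

// Python: numbers[1::2]
// Outputs: [1, 3, 5]
const odds = numbers.filter((value, index) => index % 2 === 1);
console.log(odds);

// Python: numbers[::-1]
// Copy first, because `reverse()` mutates the array that it's called on.
// Outputs: [6, 5, 4, 3, 2, 1, 0]
const reversed = [...numbers].reverse();
console.log(reversed);
```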

Before we move on to JavaScript, let’s have a little fun with slices by solving good-old Fizz Buzz. The rules of Fizz Buzz are simple: print out the numbers from one through 100 with every number divisible by three replaced with “Fizz,” every number divisible by five replaced with “Buzz,” and every number divisible by both three and five with “Fizz Buzz.” Using Python’s plethora of syntactic sugar, we can win the game without the use of recursion or iteration.

# Populate a list from 1 through 100.
outputs = list(range(1, 100 + 1))

# Replace every 3rd element with 'Fizz'.
outputs[(3 - 1)::3] = (100 // 3) * ['Fizz']
# Replace every 5th element with 'Buzz'.
outputs[(5 - 1)::5] = (100 // 5) * ['Buzz']
# Replace every (3 * 5)th element with 'Fizz Buzz'.
outputs[((3 * 5) - 1)::(3 * 5)] = (100 // (3 * 5)) * ['Fizz Buzz']

# Congrats on your new job! Please report to HR for orientation.
print(outputs)

That’s the clarity of syntax and intention that we’re going to bring to JavaScript!

Implementing Slicing Without Proxies

Proxies will help us to create a nice syntax for slicing, but we’ll also need to implement the slicing behavior in the first place. Let’s start by taking a page out of Python’s book; we’ll create a Slice class capable of storing the start, stop, and step variables.

class Slice {
  constructor(start, stop, step) {
    // Support the `Slice(stop)` signature.
    if (stop === undefined && step === undefined) {
      [start, stop] = [stop, start];
    }

    // Support numerical strings.
    this.start = start == null ? start : parseInt(start, 10);
    this.stop = stop == null ? stop : parseInt(stop, 10);
    this.step = step == null ? step : parseInt(step, 10);
  }
}

We’re mostly just cleaning the variables up a bit here and storing their values in the class. We also explicitly support Slice(stop) and Slice(start, stop, [step]) constructor signatures. That’s not particularly important, but it’s done for consistency with the Python slice class.

This next part is going to get a little more complicated. We’re going to add a Slice.indices() method that takes in an array and returns the indices which the slice corresponds to. The logic for this follows a few distinct steps, again chosen to mirror Python’s slicing behavior.

  1. Support negative indexing by adding the length of the array to start/stop if they’re negative.
  2. Set the default step to one.
  3. Set the default values of start and stop depending on whether we’re stepping forward or backwards through an array.
  4. Start at the start index, repeatedly add step until we pass stop or go out of bounds, and accumulate each index as we go.
  5. Return the accumulated array of indices.

Turning that basic logic into code gives us the following method which we’ll add inside of our Slice class definition.

  indices(array) {
    // Handle negative indices while preserving `null` values.
    const start = this.start < 0 ? this.start + array.length : this.start;
    const stop = this.stop < 0 ? this.stop + array.length : this.stop;

    // Set the default step to `1`.
    const step = this.step == null ? 1 : this.step;
    if (step === 0) {
      throw new Error('slice step cannot be zero');
    }

    // Find the starting index, and construct a check for if an index should be included.
    let currentIndex;
    let indexIsValid;
    if (step > 0) {
      currentIndex = start == null ? 0 : Math.max(start, 0);
      const maximumPossibleIndex = stop == null ? array.length - 1 : stop - 1;
      indexIsValid = (index) => index <= maximumPossibleIndex;
    } else {
      currentIndex = start == null ? array.length - 1 : Math.min(start, array.length - 1);
      const minimumPossibleIndex = stop == null ? 0 : stop + 1;
      indexIsValid = (index) => index >= minimumPossibleIndex;
    }

    // Loop through and add indices until we've completed the loop.
    const indices = [];
    while (indexIsValid(currentIndex)) {
      if (currentIndex >= 0 && currentIndex < array.length) {
        indices.push(currentIndex);
      }
      currentIndex += step;
    }

    return indices;
  }
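For comparison, here's a sketch of the same calculation that clamps start and stop up front, in the style of the indices() method on Python's slice objects, so that the loop never visits an out-of-range index. The sliceIndices() name and its length-first signature are made up for this illustration; it isn't the code used in the rest of this article.

```javascript
// Hypothetical standalone variant: compute the indices selected by a slice
// over a sequence of the given length, clamping the bounds before looping.
function sliceIndices(length, start = null, stop = null, step = null) {
  const stride = step == null ? 1 : step;
  if (stride === 0) {
    throw new Error('slice step cannot be zero');
  }

  // The valid range for the bounds depends on the direction of travel.
  const lower = stride < 0 ? -1 : 0;
  const upper = stride < 0 ? length - 1 : length;

  // Resolve a bound: apply the default, shift negatives, then clamp.
  const resolve = (value, fallback) => {
    if (value == null) return fallback;
    const shifted = value < 0 ? value + length : value;
    return Math.min(Math.max(shifted, lower), upper);
  };
  const first = resolve(start, stride < 0 ? upper : lower);
  const last = resolve(stop, stride < 0 ? lower : upper);

  // Walk from `first` toward `last`, excluding `last` itself.
  const indices = [];
  for (let i = first; stride > 0 ? i < last : i > last; i += stride) {
    indices.push(i);
  }
  return indices;
}

// Python: list(range(6))[::-2] selects indices 5, 3, 1.
// Outputs: [5, 3, 1]
console.log(sliceIndices(6, null, null, -2));

// An oversized stop is clamped rather than looped over.
// Outputs: [0, 1, 2, 3, 4, 5]
console.log(sliceIndices(6, null, 1000000000));
```

The trade-off is a little more up-front logic in exchange for a loop whose length is bounded by the size of the array.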

Once we know how to compute the indices that a slice corresponds to, it’s pretty trivial to retrieve the values from the array instead. We can add a get() method to Slice to accomplish this by mapping each index to its corresponding value with Array.map().

  get(array) {
    return this.indices(array).map(index => array[index]);
  }

Finally, we can follow a similar pattern for assignment by defining a set() method that takes the array as its first argument and an array of values to replace the slice with as its second. We’ll check that both the slice and the values have the same length to be safe, but this primarily requires only another simple loop over the generated indices.

  set(array, values) {
    // Require the lengths of the slice and `values` to match.
    const indices = this.indices(array);
    if (indices.length !== values.length) {
      throw new Error(
        `attempt to assign sequence of size ${values.length} ` +
        `to extended slice of size ${indices.length}`
      );
    }
    // Loop through and set each value in the array.
    indices.forEach((arrayIndex, valuesIndex) => {
      array[arrayIndex] = values[valuesIndex];
    });
    return true;
  }

With that last piece in place, we can now do pretty much anything that we can do using Python’s extended slicing.

const array = [1, 2, 3, 4, 5, 6];


// Outputs: [1, 2, 3]
console.log((new Slice(3)).get(array));

// Outputs: [6, 4, 2]
console.log((new Slice(null, null, -2)).get(array));

// Outputs: [1, 'two', 3, 'four', 5, 'six']
(new Slice(null, null, -2)).set(array, ['six', 'four', 'two']);
console.log(array);

The functionality here is pretty neat, but the interface is atrocious.

Making Everything Better With Proxies

Now we can get to the fun part: using proxies to wrap some nice syntactic sugar around the slicing functionality that we just implemented. Let’s start by defining a SliceArray class, and seeing how we integrate a proxy with it. You can ignore that the constructTrap() method is undefined for now. We’ll come back to that once we understand the rest of what’s going on here.

class SliceArray extends Array {
  constructor(...args) {
    super(...args);

    return new Proxy(this, {
      get: constructTrap('get'),
      set: constructTrap('set'),
    });
  }
}

The first thing to notice is that our class is extending the built-in Array type, and passing all of its constructor arguments to the Array constructor using a call to super(). If we just stopped here, then our class would basically be a subtyped array. What we do instead is return a newly instantiated proxy–that’s where the magic happens.

One might reasonably assume that evaluating something like const array = new SliceArray(); would result in array containing an instance of SliceArray. It turns out that that’s not necessarily the case. By returning an explicit value from our constructor, we’re able to override the value that gets returned when the class is instantiated. Relating this back to Python for a minute, this is roughly equivalent to implementing the __new__() magic method, while a JavaScript constructor without a return value would be more akin to __init__().
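Here's a small sketch of that constructor behavior in isolation. The Ordinary, Replaced, and Wrapped class names are made up for the example.

```javascript
class Ordinary {
  constructor() {
    this.kind = 'instance';
  }
}

class Replaced {
  constructor() {
    this.kind = 'instance';
    // Returning an object here overrides the usual result of `new`.
    return { kind: 'replacement' };
  }
}

class Wrapped {
  constructor() {
    // A proxy with no traps forwards everything to `this`.
    return new Proxy(this, {});
  }
}

// Outputs: instance
console.log(new Ordinary().kind);

// Outputs: replacement
console.log(new Replaced().kind);

// Outputs: false
console.log(new Replaced() instanceof Replaced);

// Outputs: true
console.log(new Wrapped() instanceof Wrapped);
```

The last line works because the prototype lookup behind instanceof is one of the operations that a proxy forwards to its target by default.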

When we used __getitem__() in conjunction with our Python SliceProbe class earlier, we defined the method directly on the class. Proxies work a little bit differently. A proxy instance is a separate object that wraps around a target object and forwards operations to the target. You can optionally specify “traps” on a proxy which intercept operations that would otherwise be forwarded to the target unimpeded. Returning a proxy from our class’ constructor is a trick to make it seem like we’ve modified the behavior of property access on SliceArray. The difference is a bit subtle from an end user’s perspective, but it’s a massive and fundamental difference between how proxies work in JavaScript compared to actual operator overloading in other languages.

There are thirteen different operations that proxies are able to intercept. These include things like using the in/new/delete operators, function calls, and property access. The two that we’ll be using are handler.get() and handler.set(). These allow us to intercept what happens when somebody tries to interact with our proxy using square brackets. Assignment will be intercepted by the handler.set() trap, and read access will be intercepted by the handler.get() trap.

The proxy constructor itself has a signature of Proxy(target, handler). The target parameter specifies the object that’s being wrapped. We used this as the target when constructing the proxy in SliceArray.constructor() which refers to the actual SliceArray instance that never actually gets returned to the user. The handler parameter is simply a map of trap names to the methods that handle them. We specified traps for get and set which correspond to whatever methods constructTrap('get') and constructTrap('set') return.
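As a quick sketch of how the target and handler fit together, here's a proxy with a single get trap. The object and property names are invented for the example. Any operation without a trap, like the assignment below, goes straight through to the target.

```javascript
const target = { greeting: 'hello' };

const handler = {
  // Only reads are intercepted; string values are upper-cased on the way out.
  get(trapTarget, name, receiver) {
    const value = Reflect.get(trapTarget, name, receiver);
    return typeof value === 'string' ? value.toUpperCase() : value;
  },
};

const proxy = new Proxy(target, handler);

// Outputs: HELLO
console.log(proxy.greeting);

// There's no `set` trap, so this write lands on the target unmodified.
proxy.farewell = 'goodbye';

// Outputs: goodbye
console.log(target.farewell);

// Outputs: GOODBYE
console.log(proxy.farewell);
```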

We’ll get to the actual traps in just a minute, but we have to address something real quickly first: it’s not possible to recreate Python’s slice syntax in JavaScript. Well, not exactly at least. Our proxy is hooking into how property access is handled for an object. By the time something is considered property access in JavaScript, it’s already coerced the property name into a string. There’s no difference between array[1] and array['1'] in JavaScript, and proxies don’t change that.

This means that proxies can’t make it possible to do something like array[::-1] in JavaScript. The use of the colons produces a syntax error long before the proxy traps would be relevant. What we can do is to make array['::-1'] work… but that’s a little unsatisfying. As soon as we want to use variables, we would need to start doing things like array[`${start}:${stop}:${step}`]. That’s not syntactic sugar… it’s syntactic aspartame at best.

My proposed solution here is to try to capture the spirit of Python’s slice syntax rather than the exact syntax itself. We can accomplish this using an alternative syntax where we use double square brackets and commas in place of colons: array[::-1] => array[[,,-1]], array[:n] => array[[,n]], array[-5::2] => array[[-5,,2]], and so on. This might not be quite as concise as Python’s syntax, but it only requires two extra keystrokes per slice and I think it’s as close as we’re gonna get.

The way that this double bracket syntax works is that the inner brackets actually construct an array, Array.toString() is called to convert the array into a property name, and then this key is passed to our handler traps. The array-to-string conversion just joins the values of each array member together separated by commas, and the empty array positions are left blank. This is basically a poor man’s string interpolation. The results are identical to what the array[`${start}:${stop}:${step}`] approach would produce–only with commas in place of colons.
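The conversion is easy to check directly. This sketch uses String() to trigger the same array-to-string conversion that property access performs, plus a throwaway proxy (called seen here) that echoes back the key that it receives. Note the last two lines: a single trailing comma is ignored in an array literal, so expressing an open-ended stop takes two commas.

```javascript
// Outputs: ,,-1
console.log(String([, , -1]));

// Outputs: ,3
console.log(String([, 3]));

// Outputs: -5,,2
console.log(String([-5, , 2]));

// The key that a trap would see for the double bracket syntax.
const seen = new Proxy({}, { get: (target, name) => name });

// Outputs: ,,-1
console.log(seen[[, , -1]]);

// Outputs: 1,
console.log(seen[[1, ,]]);

// Outputs: 1
console.log(seen[[1,]]);
```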

Now that we have that out of the way, let’s take a look at the actual constructTrap() method that we used in the SliceArray constructor to construct the handler.get and handler.set traps.

// Helper method that constructs either a `get` or `set` trap.
const constructTrap = action => (target, name, value) => {
  const key = (name || '').toString()
    .replace(/\s/g, '')  // Remove all whitespace.
    .replace(/,/g, ':')  // Replace commas with colons.

  // Handle negative indices.
  if (/^-\d+$/.test(key)) {
    const newKey = target.length + parseInt(key, 10);
    return Reflect[action](target, newKey, value);
  }

  // Handle slices.
  if (/^(-?\d+)?(:(-?\d+)?(:(-?\d+)?)?)$/.test(key)) {
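    // Map omitted parts to `null` rather than `undefined`, so that a key like `1:`
    // isn't mistaken for the single-argument `Slice(stop)` signature.
    const [start, stop, step] = key.split(':').map(part => part.length ? part : null);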
    const slice = new Slice(start, stop, step);
    return slice[action](target, value);
  }

  // Fall back to the array's own properties.
  return Reflect[action](target, name, value);
};

We can see here that constructTrap() takes one argument called action which corresponds to either “get” or “set.” It then returns a newly constructed trap/method with a signature of (target, name, value). We’re actually cheating a little here because the handler.set trap is called with (target, name, value, receiver), while the handler.get trap is called with (target, name, receiver) and has no value at all.

In both cases, the target parameter corresponds to the array that the proxy is wrapping, while name corresponds to the name of the property that’s being accessed. The value parameter is the value that’s being assigned for handler.set, but it doesn’t make any sense to have a value during handler.get access. In that case, the third argument is actually the receiver, which is the proxy itself. That happens to be exactly what Reflect.get() accepts as its third argument, and Slice.get() simply ignores it, so it won’t hurt anything and this pattern allows us to use the same logic for both traps.
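A quick way to check what a get trap actually receives in its third position is to record it. This is only a sketch with made-up variable names; the ECMAScript specification calls that argument the receiver.

```javascript
let thirdArgument;

const proxy = new Proxy({}, {
  get(target, name, third) {
    // Stash the third argument so that it can be inspected afterwards.
    thirdArgument = third;
    return undefined;
  },
});

// Trigger the trap with an arbitrary property read.
proxy.anything;

// Outputs: true
console.log(thirdArgument === proxy);
```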

The first thing that the constructed trap itself does is clean up the property name a bit by removing white space and replacing commas with colons to accommodate the double bracket syntax. The next thing after that is to use a regular expression to check for negative numbers. If the property name does correspond to a negative number, then we’ll want to pass on the property access to the underlying array after adding the length of the array to the negative array index.

One way that we could accomplish this would be to write something like this.

  // Handle negative indices.
  if (/^-\d+$/.test(key)) {
    const newKey = target.length + parseInt(key, 10);
    if (action === 'get') {
      return target[newKey];
    } else if (action === 'set') {
      target[newKey] = value;
      // A `set` trap has to return `true` to signal that the assignment succeeded.
      return true;
    }
  }

That’s roughly equivalent to the actual implementation, but we chose to use Reflect.get and Reflect.set instead even though they accomplish the same thing. If you’re not familiar with Reflect: it’s basically the yin to Proxy class’ yang. For every proxy handler trap there’s a corresponding method on Reflect with the exact same name, method signature, and meaning. While proxies allow you to intercept operations, the reflection interface allows you to pass them on to another object.

In this case, we added the array’s length to the key, and then passed on the operation to the underlying array using the new key. The code itself is completely agnostic to whether action corresponds to set or get. In fact, this same code would also work for handler.has and handler.deleteProperty. The Reflect abstraction is incredibly useful because it allows us to write proxy traps using a consistent and concise syntax.
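To illustrate how far that agnosticism goes, here's a sketch where one forwarding function backs four different traps. The translateKey() and forward() helpers are hypothetical names for this example, and slices are left out to keep it short.

```javascript
// Hypothetical helper: turn a negative index like '-1' into a regular one.
const translateKey = (target, name) => (
  typeof name === 'string' && /^-\d+$/.test(name)
    ? String(target.length + Number(name))
    : name
);

// Build a trap for any operation whose first two arguments are (target, name).
// Remaining arguments (a value, a receiver) are passed along as they are.
const forward = action => (target, name, ...rest) => (
  Reflect[action](target, translateKey(target, name), ...rest)
);

const values = new Proxy([10, 20, 30], {
  get: forward('get'),
  set: forward('set'),
  has: forward('has'),
  deleteProperty: forward('deleteProperty'),
});

// Outputs: 30
console.log(values[-1]);

// Outputs: 99
values[-1] = 99;
console.log(values[2]);

// Outputs: true
console.log(-1 in values);

// Outputs: false
delete values[-3];
console.log(0 in values);
```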

As a brief aside: when Reflect was first added to JavaScript, there were a number of developers who interpreted the interface as the cool new way to interact with objects in general. I know of one library with over fifteen thousand stars on GitHub where they use Reflect with string literals for access like Reflect.get(myObject, 'myParameter') instead of myObject.myParameter. There was even an ESLint rule called prefer-reflect that enforced this usage for certain actions. The deprecation notice from the documentation is mildly amusing.

This rule was deprecated in ESLint v3.9.0 and will not be replaced. The original intent of this rule now seems misguided as we have come to understand that Reflect methods are not actually intended to replace the Object counterparts the rule suggests, but rather exist as low-level primitives to be used with proxies in order to replicate the default behavior of various previously existing functionality.

Oh JavaScript community, never change. Never change.

Anyway… after making sure that the key doesn’t correspond to a negative index, we do a second check with a regular expression to see if it corresponds to a slice. When the key does match the slice definition syntax, we parse the start, stop, and step variables from that, and use them to construct an instance of our Slice class. Then we just need to call either Slice.get() or Slice.set() depending on what the value of action is. You can see now how our Slice method names and signatures were chosen to match up with the proxy trap signatures in much the same way as the reflection interface.
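The parsing step can be sketched on its own. The parseSliceKey() function below is a hypothetical helper, not part of the code above; it applies the same cleanup and regular expression, then returns the three parts as numbers, with null standing in for anything that was omitted.

```javascript
// Hypothetical helper: parse a property name into slice parts, or return
// `null` if the name doesn't look like a slice.
const parseSliceKey = (name) => {
  const key = String(name)
    .replace(/\s/g, '')  // Remove all whitespace.
    .replace(/,/g, ':'); // Replace commas with colons.
  if (!/^(-?\d+)?(:(-?\d+)?(:(-?\d+)?)?)$/.test(key)) {
    return null;
  }
  // Omitted parts become `null`; a missing third part defaults to `null` too.
  const [start = null, stop = null, step = null] = key
    .split(':')
    .map(part => (part.length ? parseInt(part, 10) : null));
  return { start, stop, step };
};

// Outputs: { start: null, stop: null, step: -1 }
console.log(parseSliceKey(String([, , -1])));

// Outputs: { start: -5, stop: null, step: 2 }
console.log(parseSliceKey('-5::2'));

// Plain indices and method names aren't slices.
// Outputs: null
console.log(parseSliceKey('3'));

// Outputs: null
console.log(parseSliceKey('push'));
```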

Finally, we’re left with keys that don’t match the negative index syntax or the slice syntax. We can just pass along the trap’s arguments directly to the corresponding reflection method for anything else. The operation will be processed exactly as it would be if there were no proxy involved, and we pretty much don’t have to worry about it.

The final return statement handles literally every other property name, but I’ll explicitly point out that that includes the keys of methods like Array.push() and Array.shift() which mutate the underlying array. All of that extra stuff that we would have had to worry about with Object.defineProperty()? We don’t have to worry about it with proxies. We can just wedge ourselves in there between the proxy and the underlying object, intercept and handle the specific operations that we care about, and pass everything else on to the underlying object.

Conclusion

Thanks for reading! This was especially fun to write because we got to take a look at how two different languages work, and carry over an awesome part of one into another. I’m honestly a pretty big fan of the double bracket slice syntax; it’s almost surprising how close to Python’s own extended slice syntax we were able to get. Sure, we had to abuse proxies a little bit to get there, but that’s what they’re there for, right? If you’re a fan of the syntax too, we’ve made it available as an npm package called slice that you can use in your own projects.

For anyone whose interest in proxies has been piqued: I also highly recommend taking a look at our Remote Browser project. When it comes to using proxies to create really powerful interfaces, Remote Browser blows what we did here out of the water. And while you’re on GitHub anyway, feel free to star our intoli-article-materials repository! We put up supplementary materials for most of our articles there, and starring it is a great way to find out about new content.
