What is an index signature in a programming language? – Personal page of chai2010 – News Fast Delivery

1. Background

Recently I have been working on the compiler for KCL, the configuration language built into KusionStack. The grammar of the language includes a concept called an “index signature”. In community discussions I found that many people do not understand what an index signature is. When I thought about it, I realized that I only knew what one looks like and could not give a complete definition, so I decided to write this post to sort out what an “index signature” is.

2. Know the meaning of the name

First, the idea behind index signatures is neither mysterious nor new. Anyone who did early Windows development will have seen a similar naming convention (Hungarian notation):

bool (BOOL) starts with b: bIsParent
byte (BYTE) starts with by: byFlag
short (int) starts with n: nStepCount
long (LONG) starts with l: lSum
char (CHAR) starts with c: cCount

Being able to tell the type of a variable or class member from its name alone makes the types in the code easier to read. This convention never became part of the C++ language itself, though. If a language let you declare that “members starting with b are of type BOOL”, that would be a remarkable feature, and that is in fact the simple purpose of an index signature.

In the literal sense, “index signature” includes two parts, “index” and “signature”.

2.1 Index (index)

From a developer’s point of view, an index is similar to a pointer in C: it acts like an arrow pointing at one specific thing. We may not be able to reach that thing directly, or it may be mixed in with many other things so that finding it directly takes a long time. So we use an index to point at it. Think of tying a thread to the thing we need and keeping the other end at hand: whenever we want the thing, we do not have to search the whole pile, we just pick up the thread and follow it. The thread is the “index”. A thread cannot fork and be tied to two things at once, so by default the index of one thing does not also point to another thing.

Therefore, the main use of an index during development is to find one specific thing among many. The most common data structure, the array, is a good example. In an array, the index is an integer giving the position of each element, and an element can be located quickly from its position.

int a[3] = {1, 2, 3};

// Using index 0, we can quickly find the first of the three numbers 1, 2, 3.
assert(a[0] == 1);
assert(a[1] == 2);
assert(a[2] == 3);

Besides arrays, another data structure that uses indexes is the familiar hash table, although in some programming languages the index of a hash table is called a key.

hashtable<string, Person> a;

// "Jack" can be seen as an index: the name string is used as the index and,
// ignoring duplicate names, it points to the struct instance Person("Jack").
a.put("Jack", new Person("Jack"));
a.put("Tom", new Person("Tom"));
a.put("Lee", new Person("Lee"));

As another example, the struct or class found in many programming languages also uses the idea of indexing.

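// Person can be seen as a collection of a String and an Integer.
// Without indexes, we would only know that Person contains two attributes:
// one of type String for the name,
// and one of type Integer for the age.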
struct Person{
    name: String,
    age: Integer,
}

Person p = new Person(name: "Jack", age: 10);

// Through the index name, we can easily get the Person's name, Jack.
assert p.name == "Jack"

// Through the index age, we can easily get the Person's age, 10.
assert p.age == 10

To sum up, an index can be regarded as a pointer with no particular format constraints, as long as it points to exactly one thing: it must not be ambiguous, that is, it cannot point to A and B at the same time. Alternatively, an index can be seen as a method that takes a key as its parameter and returns the thing that key points to.

Note: This description leaves out some special cases, such as scenarios where one index deliberately points to both A and B. Only the general case is discussed here.
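
The “index as a method” view can be sketched in a few lines of TypeScript. This is only an illustration: the helper makeIndex, the people table, and Tom's age are made up for this example and are not part of any library.

```typescript
// Hypothetical helper: wraps a lookup table in an "index" function.
function makeIndex<K, V>(table: Map<K, V>): (key: K) => V | undefined {
  return (key) => table.get(key);
}

// Illustrative data, echoing the Person example above.
const people = new Map<string, { name: string; age: number }>([
  ["Jack", { name: "Jack", age: 10 }],
  ["Tom", { name: "Tom", age: 12 }],
]);

const findPerson = makeIndex(people);
const jack = findPerson("Jack");  // the one thing "Jack" points to
const nobody = findPerson("Lee"); // undefined: nothing is bound to this key
```

Each key leads to at most one value, which is the “no forking thread” property described above.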

2.2 Signature

In programming languages, the word “signature” is not unique to index signatures; many common languages have some notion of a signature. For example, a type signature in C++:

char c;
double d;
int (*fPtr)(char, double); // a pointer to some function defined elsewhere

// fPtr has the signature int (*)(char, double)
int retVal = (*fPtr)(c, d);

Even though we do not know which function this pointer will eventually point to, its type signature tells us how that function is used: it takes a char and a double as parameters and returns an int. The signature also constrains the pointer: it can only point to functions that take a char and a double and return an int. Rust has a similar concept; there, a function signature can be used directly as a type:

// Signature of add: fn(i32, i32) -> i32
fn add(left: i32, right: i32) -> i32 { left + right }

// Signature of sub: fn(i32, i32) -> i32
fn sub(left: i32, right: i32) -> i32 { left - right }

// With a function signature, we can write a factory for a family of functions with the same shape.
fn select(name: &str) -> fn(i32, i32) -> i32 {
    match name {
        "add" => add,
        "sub" => sub,
        _ => unimplemented!(),
    }
}

fn main() {
    let fun = select("add");
    println!("{} + {} = {}", 1, 2, fun(1, 2));
}

Type signatures in Java follow the same core idea as those in C/C++/Rust: a signature outlines how a method is used by describing the types of its parameters and return value, without regard to how the method is implemented.

There is also the concept of type signatures in Python/Golang, and the core ideas are the same, so I won’t repeat them here.

Looking at type signatures across these languages, we can see that a signature describes the same kind of thing as a type. A type describes a set of properties, and things that share those properties can be considered to have the same type. A signature can be regarded as a composite type made up of several types.

For example:

int32 a = 0; // the type of a is int32

The type of the variable a above is int32. On hearing int32 you immediately think of some properties of a, such as 32 bits, integer, and a value range; int32 is the collective name for these properties. The next time we meet a variable b whose properties match those of int32, we can put the two in the same category: both are variables of type int32.

However, the type system of a programming language covers not only variables but also another very important thing: methods.

int add(int a, int b) {
    return a+b;
}

Now we need something that describes the type of the method above, that is, something that tells us which methods belong to the same category as add. The name? Probably not, because the following two methods share a name yet are completely different.

// Add two numbers
int add(int a, int b) {
    return a+b;
}

// Concatenate two arrays
int[] add(int a[], int b[]) {
    return a.append(b);
}

Therefore, language designers decided to define the type of a method by combining its return type with the types in its parameter list, that is:

// Add two numbers
// Type: (int, int) -> int
int add(int a, int b) {
    return a+b;
}

// Concatenate two arrays
// Type: (int[], int[]) -> int[]
int[] add(int a[], int b[]) {
    return a.append(b);
}

A signature can thus be understood as a composite type formed by combining several types. When it describes the type of a method, it is called a method signature (or function signature). By analogy, you can probably guess by now what an index signature is. As mentioned earlier, an index can be regarded as a method: give it a value, and it returns the thing that value points to.
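
To make “a signature is a composite type” concrete, here is a small TypeScript sketch. The names BinaryOp, ops, and applyOp are invented for illustration; the point is only that two functions with the same signature can be treated as values of one type.

```typescript
// A signature is itself a type: "takes two numbers, returns a number".
type BinaryOp = (left: number, right: number) => number;

const add: BinaryOp = (left, right) => left + right;
const sub: BinaryOp = (left, right) => left - right;

// Because add and sub share one signature, a single table can hold both.
const ops: Record<string, BinaryOp> = { add, sub };

// Hypothetical helper: look an operation up by name and call it.
function applyOp(name: string, left: number, right: number): number {
  const op = ops[name];
  if (op === undefined) {
    throw new Error(`unknown operation: ${name}`);
  }
  return op(left, right);
}

const sum = applyOp("add", 1, 2);
const difference = applyOp("sub", 5, 3);
```

Note that the table ops already behaves like an index: give it a name, and it returns the function that name points to.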

2.3 IndexSignature

As described above, an index can be regarded as a pointer or a method, and a signature as a composite type formed by combining several types. An index signature describes the type of an index. At this point I had a doubt: if an index signature is just the type of the index, why does it need a composite type? Can’t an ordinary type describe the type of an index?

In a[0], isn’t the type of the index Integer?
In hash.get("name"), isn’t the type of the index String?

This question comes from a misunderstanding of what the index is:

The index in a[0] is not 0; it is the mapping 0 -> a[0], that is, input 0 and get back a[0].
The index in hash.get("name") is likewise not "name"; it is the mapping "name" -> "Jack", that is, input "name" and get back "Jack".

By now, readers who work in various programming languages probably sense that they have already met index signatures in some form without knowing the name. I say this because, while writing, I remembered the HashMap I used when developing in Java:

import java.util.HashMap;

public class RunoobTest {
    public static void main(String[] args) {
        HashMap<String, String> Sites = new HashMap<String, String>();

        Sites.put("one", "Google");
        Sites.put("two", "Runoob");
        Sites.put("three", "Taobao");
        Sites.put("four", "Zhihu");

        System.out.println(Sites);
    }
}

The declaration HashMap<String, String> Sites = new HashMap<String, String>() in the code above can be understood as an index signature: it defines the type of the index in this HashMap, namely input a String and get back a String. An array has an index signature too, but its input type must always be int, so the language leaves it implicit and we never have to write it out.

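// With an explicit index signature (hypothetical syntax): Array<int, int> a = {0, 1, 2}
int[] a = {0, 1, 2};

// With an explicit index signature (hypothetical syntax): Array<int, String> b = {"0", "1", "2"}
String[] b = {"0", "1", "2"};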
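
For comparison, TypeScript lets you write both kinds of index signature out explicitly. The following is a minimal sketch; the interface names StringList and StringMap are made up for this example.

```typescript
// An array-like type with its index signature spelled out:
// input a number, get back a string.
interface StringList {
  [index: number]: string;
}

// A dictionary type: input a string, get back a string.
interface StringMap {
  [key: string]: string;
}

const list: StringList = ["0", "1", "2"];
const sites: StringMap = { one: "Google", two: "Runoob" };

const second = list[1];         // number in, string out
const firstSite = sites["one"]; // string in, string out
```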

3. Index signatures in some languages

The idea of index signatures has a long history. It can even be traced back to the naming conventions programmers adopted in the early days to make programs more readable. When we stipulate that the name of an integer variable must start with i, we are in effect defining the signature of an index that points to an integer.

int i_user_id = 10; // integers start with i, defining the index signature <string starting with i, int>
float f_user_weight = 120.3; // floats start with f, defining the index signature <string starting with f, float>

However, not everyone follows a convention. Once index names become program data that can be manipulated dynamically, relying on convention alone for index signatures is no longer adequate.

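// Suppose a programming element lets you add indexes dynamically.
const a = {
    "name": "Jack"
}

// You and your teammate agree that the index for age is "age".
// Somewhere, your teammate calls add("age", 10).
a.add("age", 10);

// Somewhere else, you need the age.
a.get("age");

// If the index signature is only a convention and is not enforced,
// your teammate's hand slips, the key is misspelled, and no warning appears.
a.add("aeg", 10);

// All you will see on your side is a null pointer exception.
NullPointerException: "age" is not defined.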

Therefore, to make programs more robust and avoid this unnecessary risk, some general-purpose programming languages (such as TypeScript) and domain languages (such as KCL and CUE) expose index signatures to developers as a language feature, aiming to reduce the impact of the problems above by making programming safer and more stable.
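
The misspelled-key problem can be reproduced in TypeScript, along with one way a type system can catch it. This is a rough sketch assuming strict compiler settings; the names loose, checked, and PersonKey are invented for illustration.

```typescript
// With a plain string index signature any key is allowed, so a typo compiles.
const loose: { [key: string]: number } = {};
loose["aeg"] = 10;            // typo for "age", accepted silently
const missing = loose["age"]; // undefined at runtime

// Restricting the keys to a known set lets the compiler reject the typo.
type PersonKey = "name" | "age";
const checked: Partial<Record<PersonKey, string | number>> = {};
checked["age"] = 10;
// checked["aeg"] = 10; // rejected under strict settings: "aeg" is not a PersonKey
```

The first half is the “convention only” situation described above; the second half shows the key set becoming part of the type.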

3.1 TypeScript Index Signatures

In TS, we can define an object in the following way:

const salary1 = {
  baseSalary: 100_000,
  yearlyBonus: 20_000
};

From the description of indexes above, this object has two indexes, and their type, that is, their index signature, is the same: they are the same kind of index.

const salary1 = {
  baseSalary: 100_000, // index 1: input "baseSalary", get back 100_000
  yearlyBonus: 20_000 // index 2: input "yearlyBonus", get back 20_000
};

TS provides a feature that lets developers write such an index signature explicitly:

interface NumbersNames {
  [key: string]: string // index type: input a string, get back a string
}

const names: NumbersNames = {
  '1': 'one',
  '2': 'two',
  '3': 'three',
  // etc...
  '5': 5  // Error: the value is a number, which does not match the string value type.
};
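
A short usage sketch may help here. It redeclares the interface under the hypothetical name DigitNames so that it stands alone, and shows that a numeric key is accepted by a string index signature because JavaScript converts property keys to strings.

```typescript
interface DigitNames {
  [key: string]: string;
}

const digitNames: DigitNames = { "1": "one", "2": "two", "3": "three" };

const byString = digitNames["1"];
const byNumber = digitNames[1]; // the number 1 is converted to the key "1"

digitNames["4"] = "four";       // new keys are fine as long as the value is a string
// digitNames["5"] = 5;         // would not compile: number is not assignable to string
```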

3.2 CUE index signature

CUE supports regular expressions in index signatures, so index names can be checked as well.

a: {
    foo:    string    // index foo returns a string
    [=~"^i"]: int     // indexes starting with i all return int
    [=~"^b"]: bool    // indexes starting with b all return bool
    ...string         // all other indexes return string
}

b: a & {
    i3:    3          // index i3 starts with i; its value 3 is an int
    bar:   true       // index bar starts with b; its value true is a bool
    other: "a string" // all other indexes return strings
}
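
To show what such a pattern constraint checks, here is a rough runtime analogue in TypeScript. It is not how CUE is implemented; the names patternRules and violations are hypothetical, and the rules simply mirror the three constraints in the example above.

```typescript
type Check = (value: unknown) => boolean;

const isInt: Check = (v) => typeof v === "number" && Number.isInteger(v);
const isBool: Check = (v) => typeof v === "boolean";
const isString: Check = (v) => typeof v === "string";

// Hypothetical rule table: key pattern -> required value check.
const patternRules: Array<[RegExp, Check]> = [
  [/^i/, isInt],
  [/^b/, isBool],
];

// Returns the keys whose values break a rule matching the key.
function violations(obj: Record<string, unknown>): string[] {
  return Object.keys(obj).filter((key) => {
    const checks = patternRules
      .filter(([pattern]) => pattern.test(key))
      .map(([, check]) => check);
    // Keys matching no pattern fall back to "everything else is a string".
    const active = checks.length > 0 ? checks : [isString];
    return !active.every((check) => check(obj[key]));
  });
}

const clean = violations({ i3: 3, bar: true, other: "a string" }); // []
const broken = violations({ i3: "3", bar: true });                 // ["i3"]
```

Here the key decides which value type is required, which is exactly the “members starting with b are of type BOOL” idea from the beginning of the article.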

3.3 KCL index signature

A KCL index signature has the form [<attr_name>: <index_type>]: <value_type>. Semantically, it means that the keys of all attributes in the structure must be of type <index_type> and the values must be of type <value_type>.

<index_type> can only be str, int, or float; it cannot be a union type.
<value_type> can be any legal KCL data type, including schema types and union types.
<attr_name> is any legal KCL identifier and may be omitted; it is generally used together with check.

Basic usage

schema definition method

schema Map:
    [str]: str

Note that schemas that use index signatures are relaxed by default.
An index signature can only be defined once in a schema.

Advanced usage

Writing a default value for an index signature

schema Map:
    [str]: str = {"default_key": "default_value"}

Combined with attribute definitions, the index signature enforces the key and value types of all attributes in the schema:

schema Person:
    name: str
    age: int  # error: conflicts with the semantics of [str]: str
    [str]: str  # all attribute values in the schema must be strings

Schema attributes and an index signature can be defined in the same schema. This is usually done to constrain the types of additional attributes, forcing every attribute other than those the schema declares to have the given key and value types.

schema Person:
    name: str
    age?: int
    [...str]: str  # apart from name and age, every other schema attribute must have a string key and a string value
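
For readers more familiar with TypeScript, a loose comparison may help. In TypeScript an index signature always applies to the declared properties too, so it behaves like the conflicting [str]: str case above rather than like [...str]: str. The interface name PersonLike and the city attribute are made up for this sketch.

```typescript
interface PersonLike {
  name: string;
  age?: number;
  // Every property, declared or not, must fit this value type,
  // so the union has to cover name and age as well.
  [key: string]: string | number | undefined;
}

// Declaring `[key: string]: string` instead would be rejected,
// because age is a number and would not fit a string-only index signature.
const person: PersonLike = { name: "Jack", age: 10, city: "Hangzhou" };
const city = person["city"];
```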

Using the attribute name with check

schema Data:
    [dataName: str]: str

    check:
        dataName in ["Alice", "Bob", "John"]

data = Data {
    Alice: "10"
    Bob: "12"
    Jonn: "8"  # error: Jonn is not in ["Alice", "Bob", "John"]
}

Note: KCL index signatures do not currently support union types and literal types.
Note: Index signatures do not currently support checking the content of values, only their types.
Note: Index signatures do not support regular-expression checks on keys like CUE's [=~"^b"], because such checks happen at runtime and are not part of the type system. Combining runtime checks with type checking is not easy, so this is not supported for the time being.

4. Summary

This article briefly introduces index signatures. By sorting out the concepts of “index” and “signature” and comparing how signatures are used in some general-purpose and domain languages, it sketches what index signatures look like in general, in the hope of making the concept easier to understand. The content is only the author’s personal understanding of index signatures; if anything is wrong or inappropriate, corrections are welcome.
