Editor’s note:

For a long time, the contribution of individual developers in the open source community has been roughly measured by an invisible ruler, such as vague personal popularity or persuasiveness, or by titles such as Maintainer and Committer bestowed by the community. Correspondingly, there has long been no quantifiable indicator of how individual developers benefit from open source contributions, or of how to evaluate that benefit accurately.

To evaluate personal contribution and influence in open source projects more accurately, Zhao Shengyu of the X-lab open laboratory has tried to establish a specific measurement system that measures developers' contributions by building a collaboration network within open source projects. Zhao Shengyu believes that contribution evaluation is also a good supplement and support for the spirit of open source. At the same time, matching effort with return also benefits the sustainability of open source collaboration and the large-scale development of open source communities.

At the end of the article, Zhao Shengyu also invites everyone to take part in a simple poll and hopes readers will leave their choices and opinions.

Guest: Zhao Shengyu, X-lab core member

News Fast Delivery: Why did you think of establishing a measurement system to evaluate individual contributions in open source projects?

Zhao Shengyu: Over the past few decades, open source has brought great changes to the software industry, especially to basic software, and has greatly promoted the industry's development. In stark contrast, the authors of open source software cannot profit from the value they create, or even make ends meet.

Although I have said before that open source, as a method of large-scale collaboration, needs a complete closed loop of value to develop healthily, and have also called out that “In a world of open collaboration, every contribution is worth rewarding!”, even if the commercialization and profitability of an open source project are solved, how to distribute those benefits effectively to specific developers is still a problem.

In fact, the contradiction between the highly free flow of productivity in the open source world and relations of production that rely on employment by traditional companies has become increasingly prominent. Quantifying the contribution of developers who move freely through the community is the most important link in solving value distribution in open source.

Although most contribution analysis inside companies counts contributions by code changes or by the number of work items completed, I still hope to exploit the social nature of open source development and use collaborative relationships to build a more scientific and effective measurement method, one that can better solve the problem of value distribution in open source.

News Fast Delivery: The developer contribution evaluations we see in projects are basically statistics based on the number of Issues, PRs, or lines of code that developers submit. How does “using collaborative relationships to build a more effective measurement method” work?

Zhao Shengyu:

Let's take search engines as an example and talk about how web pages are ranked. The earliest search engines used the content of web pages as the basis for ranking. Although search relevance was very good, ranking pages by quality remained a problem. Google's PageRank algorithm uses the link relationships between web pages to evaluate their quality. The basic assumption is that high-quality pages are more likely to be cited by other high-quality pages. This algorithm brought an efficient, high-quality solution to page quality ranking without looking at the page content itself.
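The idea can be illustrated with a minimal sketch of the textbook PageRank iteration. This is not Google's production implementation; the function name `pagerank`, the damping factor of 0.85, the iteration count and the toy link graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> rank; the ranks sum to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it cites.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy web: "c" is cited by three pages, "d" by none.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
```

In this toy graph the most cited page, "c", ends up with the highest rank and the uncited page, "d", with the lowest, even though the code never inspects page content.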

Collaboration networks are similar. My basic hypothesis is that high-influence developers are more likely to attract other high-influence developers to collaborate with them, for example through in-depth discussion on the same Issue or discussion of code details on a PR.

In the global collaboration network, we use projects and developers as nodes and build a huge collaboration network with activity as the edges: a developer's activity in a project is regarded as a form of collaboration.

Inside a project, the most intuitive approach is to use Issues and PRs as the basic units of collaboration, and to regard discussion in the same Issue or PR as a form of collaboration. So we get the following collaboration network diagram in a project:

As a result, a fine-grained collaboration network within an open source project is built, and we then use the OpenRank algorithm to calculate the value of each developer, Issue and PR. As with PageRank, we do not need to look at the content of each discussion or code submission in specific Issues and PRs; a developer's influence can be calculated from the collaborative relationships between developers.

Unlike PageRank, however, we do not consider relational information alone: Issues and PRs also have an initial value, which plays an important role in correcting and refining the judgment of developers' contributions.
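A rough sketch of how relational information and initial values might be combined is shown below. It is not the OpenRank implementation, whose exact formula and parameters are not given here. It assumes a simple graph in which developers are linked to the Issues and PRs they took part in, and a hypothetical `inherit` parameter that sets how much of each node's value comes from its initial value rather than from its neighbours.

```python
def rank_with_initial_values(edges, initial, inherit=0.5, iterations=100):
    """Illustrative ranking on a developer/Issue/PR collaboration graph.

    edges: list of (developer, item) pairs, one per participation.
    initial: dict of node -> initial value (nodes not listed default to 1.0).
    inherit: assumed fraction of a node's value taken from its initial value.
    Returns a dict of node -> value; the values sum to 1.
    """
    neighbors = {}
    for developer, item in edges:
        neighbors.setdefault(developer, set()).add(item)
        neighbors.setdefault(item, set()).add(developer)
    nodes = sorted(neighbors)

    # Normalize the initial values so that they sum to 1.
    total_initial = sum(initial.get(node, 1.0) for node in nodes)
    base = {node: initial.get(node, 1.0) / total_initial for node in nodes}

    value = dict(base)
    for _ in range(iterations):
        new_value = {}
        for node in nodes:
            # Each neighbour passes on an equal share of its current value.
            received = sum(value[nb] / len(neighbors[nb]) for nb in neighbors[node])
            new_value[node] = inherit * base[node] + (1.0 - inherit) * received
        value = new_value
    return value


# Hypothetical project: alice takes part in everything; pr-2 has a higher initial value.
edges = [
    ("alice", "issue-1"), ("bob", "issue-1"),
    ("alice", "pr-2"), ("carol", "pr-2"),
    ("alice", "pr-3"),
]
values = rank_with_initial_values(edges, initial={"pr-2": 3.0})
```

Here bob and carol occupy symmetric positions in the graph, so any difference between them comes only from the higher initial value of the PR that carol took part in.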

News Fast Delivery: How is the initial value of Issues and PRs determined?

Zhao Shengyu: The initial value calculation for Issues and PRs needs some explanation. Within a specific project, unlike in the global network, we can set community rules so that community members routinely express their value preferences, which helps us better judge the value of each node.

In the initial value calculation for Issues and PRs we use low-cost GitHub reactions: any developer can rate an Issue or PR with reactions, for example 👍 counts as 2 times, ❤️ as 3 times and 🚀 as 4 times. If a developer uses all three reactions, he can raise the base multiplier of an Issue or PR to at most 9 times the default.

But different developers carry different evaluation weights here. How much a developer's rating affects the final multiplier depends on his share of influence in the project in the previous month. For example, if a developer's share of influence last month was 20% and he rates an Issue at a 9 times multiplier, but he is the only one who rates it, the final base multiplier of this Issue is 0.2 × 9 + (1 − 0.2) = 2.6. So although 9 times looks like a very high multiplier, a high rating from a single account does not have a very large impact.
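The arithmetic above can be sketched in a few lines. Two points are assumptions rather than stated rules: that a developer's rating is the sum of the multipliers of the distinct reactions he leaves (which matches 2 + 3 + 4 = 9), and that with several raters the influence share that gave no rating counts at the default multiplier of 1. The function and variable names are hypothetical.

```python
# Reaction names follow GitHub's reaction identifiers; the multipliers are the
# example values from the text.
REACTION_MULTIPLIER = {"+1": 2, "heart": 3, "rocket": 4}


def developer_rating(reactions):
    """Multiplier one developer assigns: the sum over the distinct reactions
    used, or the default of 1 if the developer left none (assumed rule)."""
    return sum(REACTION_MULTIPLIER[r] for r in set(reactions)) or 1


def base_multiplier(ratings, influence_share):
    """Influence-weighted base multiplier of one Issue or PR.

    ratings: dict of developer -> rating multiplier.
    influence_share: dict of developer -> last month's share of project
        influence (0..1); developers not listed have no influence.
    """
    rated_share = sum(influence_share.get(dev, 0.0) for dev in ratings)
    weighted = sum(influence_share.get(dev, 0.0) * rating
                   for dev, rating in ratings.items())
    # The influence share that did not rate keeps the default multiplier of 1.
    return weighted + (1.0 - rated_share)


full_rating = developer_rating(["+1", "heart", "rocket"])
single_rater = base_multiplier({"dev_a": full_rating}, {"dev_a": 0.2})
no_influence = base_multiplier({"newcomer": full_rating}, {})
```

With these inputs `single_rater` reproduces the 2.6 from the example, while a rating from an account with no influence leaves the multiplier at the default of 1.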

In other words, this is a decentralized evaluation system. If a developer wants his own ratings to carry more weight, he needs a larger influence of his own in the project. If a developer has contributed nothing to the project, his reactions will not affect the result at all.

So it can be considered a kind of expert judgment, but the experts here must be people who have contributed in depth to the project.

News Fast Delivery: How can developers use this evaluation system to increase their own influence? Could people inflate their personal influence through the sheer number of Issues and PRs?

Zhao Shengyu: As a saying in management goes, “What you assess is what you get.” Under the mathematical model and calculation method above, if a developer wants to gain more influence in the community, what should he do, and what changes will those actions bring to the community?

Increase the frequency of collaboration with developers with high influence in the community

Since the collaboration network is built on collaboration units such as Issues and PRs, the simplest strategy is to increase the frequency of collaboration with high-influence developers in the community.

  • For developers who have just entered the community, the simplest way is to interact in the Issues or PRs of high-influence developers in the community, whether by joining the discussion or by reviewing their code and asking questions.
  • For developers who already have some knowledge of the project, a better strategy is to attract core developers to collaborate with you through your own high-quality Issues and PRs, for example by having them join the discussion on your questions or review your PRs. So there is already some value guidance for developers here: they need to find ways to communicate and cooperate with core developers in the community rather than blindly opening low-quality Issues or PRs. If no one is willing to discuss with you, it remains difficult to increase your influence.
  • For core developers in the project, the strategy is to build collaborative relationships with more developers, that is, to communicate and collaborate with other developers as much as possible to consolidate their core position.

Improve the quality of your contributions in the community

The previous point encourages frequent collaboration, while this one places more emphasis on the quality of contributions. Because Issues and PRs have an initial value, that initial value is very important for raising a developer's influence.

So if a developer hopes that core developers in the community will like his Issues and PRs and thereby give them a higher initial value, he has to find ways to improve the quality of his contributions and win the recognition of other developers.

In fact, once this was put into operation, we found that this value guidance brings an additional benefit: mutual praise becomes routine, which does a lot to liven up the community atmosphere. After all, there are often not that many serious discussions to be had. This kind of praise also becomes a social tool as well as a rating, so that even in a fairly serious development environment there is a more harmonious atmosphere among developers in the community. This subtle change was a surprise to me.

Contribute for as long as possible

In this algorithm, the initial influence of any developer who has just entered the community is 1, which means that a developer with only short-term contributions will find it hard to rise very high. And because each month inherits the previous month's influence as its initial value, long-term contribution basically means long-term growth of influence.

Once contributions stop, influence decays gradually, with 85% retained each month. We chose this gentle ratio rather than resetting influence to zero so that contributors are not lost because of an accidental break. At 85% a month, a developer who stops contributing for 4 months still retains more than 50% of the original influence, and one who does not contribute for a year still retains about 14%. Developers who return to the community at any time can then quickly pick up their previous influence without having to start all over again.
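The retention figures can be checked with a short calculation. It assumes that "85%" means 85% of the previous month's influence is kept for each month without contributions, which is the reading consistent with more than half remaining after four months. The function name is hypothetical.

```python
def retained_influence(months_inactive, monthly_retention=0.85):
    """Fraction of the original influence left after a number of months
    without contributions, assuming a fixed share is kept each month."""
    return monthly_retention ** months_inactive


for months in (1, 4, 12):
    print(months, round(retained_influence(months), 3))
```

Under this assumption roughly 0.522 of the original influence remains after four months and roughly 0.142 after twelve.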

Other risks

No algorithm can completely prevent gaming of the rankings. If a developer submits a large number of meaningless Issues or PRs, he can still slightly raise his influence value in the project in the short term, but this is obviously not beneficial to the project.

Therefore, the algorithm is not intended to replace maintainers' judgment of developers' contributions completely; it is a reference and a guide. If destructive behavior occurs, we can block or even expel such opportunistic developers through community norms.

In fact, trust is the basis of low-cost, efficient collaboration. In a future where data is increasingly complete and open, personal reputation will become more and more important. Opportunistic developers will not only fail to benefit from a project in the short term, but may even seriously harm their own long-term development. This needs to be considered in the design of the system, and here we can learn a lot from the design principles of some existing social systems.

News Fast Delivery: Has this algorithm mechanism been implemented in any open source communities? What changes has it brought to the community and developers?

Zhao Shengyu: You must eat your own dog food first. In fact, our scheme has already been in place at the X-lab open laboratory for about half a year. We try to move the laboratory's work online and keep it on the GitHub platform, then use this algorithm to judge students' contributions in the laboratory's projects and provide a monthly report along with additional assistance.

Judging from the global influence of all the laboratory's projects, the overall improvement is quite obvious, with steady growth every month. In the project I am mainly responsible for, students' enthusiasm for participating has risen greatly, and because of the characteristics of the algorithm itself, few students come just to pile up Issues; instead there are a few more students who participate in the project deeply and over the long term.

Mr. Wang Wei, the founder of the X-lab open laboratory, also concluded: “During this period of operation, even the community culture of the X-lab open laboratory has undergone some subtle changes: everyone is more willing to use GitHub's open collaboration mode for their work, such as reading and sharing papers; more willing to take part in project discussions and react with emoji, sometimes within seconds; more willing to invite other students to discuss and collaborate, including proactively telling everyone that a new feature has been released or a bug has been fixed; and more active in promoting the projects they take part in and recruiting new participants. Everyone's growth is visible!”

In addition, we cooperate in depth with some external communities. The Sealos community also uses our algorithm to give feedback to external developers in the community. According to their feedback, the effect is quite good: the proportion of external contributions has been rising steadily, and several long-term external contributors have emerged.

So for now I am very confident about the future application prospects of this algorithm, though it may need to be applied in more scenarios to observe its effect.

News Fast Delivery: Does using a specific set of data and frameworks to evaluate developers' contributions run contrary to the geek spirit of sharing that open source advocates?

Zhao Shengyu: I think contribution evaluation is a good supplement and support for the spirit of open source, and it does not run contrary to the geek spirit of sharing.

The spirit of open source emphasizes sharing and dedication, but the effectiveness of an economic system comes from matching effort with return. Perhaps the early geeks could open source excellent projects without regard to reward, or treated reputation as the reward, but when open source is promoted at scale as an open collaboration model, and open source software begins to unlock great value through the cloud and other models, we must consider the problems that need to be solved for its long-term development.

The core idea of OpenRank differs from traditional software data analysis. It does not emphasize mathematical models and statistical features, but gives more consideration to the social attributes of collaboration, which fits the open collaboration model of open source very well.

I believe that in a future where effort and return are better aligned, open source will bring immeasurable change to the world at an even larger scale of collaboration.

Interactive Q&A: If an open source community you participate in uses this algorithm to publicly rank contributors, would you support it?

You are welcome to leave your answer in the comment area. We will randomly pick 3 readers to receive a lucky bag as a gift!
