
brson/basic-http-server 302

A simple static HTTP server in Rust, for learning and local doc development

brson/annotated-std-rs 65

An annotation of the Rust standard library

brson/archaea 15

Historic Rust code browsing

brson/big_s 15

Rust's missing `String` literal

brson/being-rust 14

Intro to Rust talk

alegalle/rustgl_4_2_core 8

Opengl 3.2 - 4.2 bindings for rust

brson/android-plugin 2

An sbt plugin for Android development in Scala

Aimeedeer/bigannouncement 1

bigannouncement.eth

alisha17/tic-tac-toe 1

Command line and Interactive Tic-Tac-Toe in Rust

started brson/stdx

started time in 11 hours

started brson/stdx

started time in a day

started brson/rust-anthology

started time in 2 days

started brson/stdx

started time in 2 days

started brson/rust-anthology

started time in 2 days

Pull request review comment brson/basic-http-server

Multiple Improvements

 pub struct Config {
     #[structopt(name = "ROOT", parse(from_os_str), default_value = ".")]
     root_dir: PathBuf,

-    /// Enable developer extensions.
+    /// Enable developer extensions
     #[structopt(short = "x")]
     use_extensions: bool,
+
+    /// Allow serving files outside the root given by ROOT, meaning all your files are accessible.
+    ///
+    /// This allows access to *all files* on your computer, so don't use this on untrusted networks
+    /// like the internet.
+    #[structopt(long = "allow-escape-root")]
+    allow_escape_root: bool,
+
+    /// Enable basic http auth with the given password
+    #[structopt(long = "auth", parse(try_from_str))]
+    auth: Option<Auth>,
+}
+
+impl Config {
+    /// Ensure that the `root_dir` is a canonical absolute path with no `.` or `..` in.
+    fn canonical_root_dir(&mut self) -> Result<()> {
+        // This line of code takes what might be a relative path and returns an absolute path
+        // without any `.` or `..` in. If the path then points to a symbolic link, the code then
+        // follows the link, repeating until it finds something which is not a symbolic link. This
+        // is what is set as the `root_dir`.
+        //
+        // Doing this makes it possible to report the real root directory in the log, and also
+        // makes checking that a file is actually in the root directory more robust.
+        let canonical_root_dir = self.root_dir.canonicalize()?;
+        self.root_dir = canonical_root_dir;
+        Ok(())
+    }
+
+    /// Checks if the given path is in the root dir.
+    ///
+    /// If it is, return its canonical representation. If it isn't, return an error. This function
+    /// will error if the path does not point to an actual file or directory.
+    fn check_in_root_dir(&self, path: PathBuf) -> Result<PathBuf> {
+        let path = path.canonicalize()?;
+
+        // Skip the check if we've configured to allow files outside the root.
+        if self.allow_escape_root || path.starts_with(&self.root_dir) {
+            Ok(path)
+        } else {
+            Err(Error::EntityNotInRoot)
+        }
+    }
+
+    /// Check if the request has the required password (if we set one).
+    fn check_auth(&self, req: &Request<Body>) -> bool {
+        // This macro avoids us having to write out a match for every step.
+        macro_rules! err_to_ret {
+            ($e:expr) => {
+                match $e {
+                    Ok(v) => v,
+                    Err(_) => return false,
+                }
+            };
+        }
+
+        let reference_auth = match self.auth.as_ref() {
+            Some(auth) => auth,
+            // If there is no password, carry on serving the request.
+            None => return true,
+        };
+
+        // Get and decode the auth token
+        let headers = req.headers();
+        let auth_header = match headers.get(header::AUTHORIZATION) {
+            Some(header) => header,
+            // If the header isn't set, then send a request for auth.
+            None => return false,
+        };
+        let auth_header = err_to_ret!(auth_header.to_str());
+
+        if !matches!(auth_header.get(..6), Some(s) if s.eq_ignore_ascii_case("basic ")) {
+            return false;
+        }
+        let auth = match auth_header.get(6..).map(|s| s.trim()) {
+            Some(auth) => auth,
+            None => return false,
+        };
+        let auth = err_to_ret!(base64::decode(auth));
+        let auth = err_to_ret!(str::from_utf8(&auth));
+        let auth: Auth = err_to_ret!(auth.parse());
+        *reference_auth == auth
+    }
+}
+
+#[derive(Clone, PartialEq)]
+struct Auth {
+    username: String,
+    password: String,
+}
+
+impl std::str::FromStr for Auth {
+    type Err = &'static str;
+    fn from_str(s: &str) -> std::result::Result<Self, Self::Err> {

I've made this parser very lenient, to the point that it cannot fail! I think it's still a good idea to leave it in FromStr, so it could be made less lenient in the future by only altering code here.

derekdreery

comment created time in 3 days
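The body of `from_str` is cut off in the hunk above, so the following is only a minimal sketch of what a parser that "cannot fail" might look like. The splitting rule (first `:` separates username from password, and input without a colon becomes a username with an empty password) is an assumption for illustration, not the PR's confirmed behaviour. The `Debug` derive is added here only so the struct can be used in assertions.

```rust
use std::str::FromStr;

/// Credentials for HTTP basic auth, mirroring the `Auth` struct in the diff.
#[derive(Clone, Debug, PartialEq)]
struct Auth {
    username: String,
    password: String,
}

impl FromStr for Auth {
    // Same error type as in the diff, although this sketch never returns it.
    type Err = &'static str;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Split at the first `:` only, so a password may itself contain colons.
        // Assumed rule: with no colon at all, the whole input is the username.
        let (username, password) = match s.split_once(':') {
            Some((user, pass)) => (user, pass),
            None => (s, ""),
        };
        Ok(Auth {
            username: username.to_owned(),
            password: password.to_owned(),
        })
    }
}
```

Keeping the logic behind `FromStr`, as the comment suggests, means a stricter rule (for example rejecting an empty username) could later be added by returning `Err` from this one function.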

Pull request review comment brson/basic-http-server

Multiple Improvements

 async fn serve(config: Config, req: Request<Body>) -> Response<Body> {

 /// Handle all types of requests, but don't deal with transforming internal
 /// errors to HTTP error responses.
-async fn serve_or_error(config: Config, req: Request<Body>) -> Result<Response<Body>> {
+async fn serve_or_error(config: Arc<Config>, req: Request<Body>) -> Result<Response<Body>> {
     // This server only supports the GET method. Return an appropriate
     // response otherwise.
     if let Some(resp) = handle_unsupported_request(&req) {
         return resp;
     }

+    // If there is a password, check the password. Return unauthorized if it was missing/incorrect
+    if !config.check_auth(&req) {
+        let mut auth_value =
+            HeaderValue::from_static(r#"Basic realm="User Visible Realm", charset="UTF-8""#);
+        auth_value.set_sensitive(true);
+        let mut headers = HeaderMap::new();
+        headers.insert(header::WWW_AUTHENTICATE, auth_value);
+        return make_error_response_from_code_and_headers(StatusCode::UNAUTHORIZED, headers);
+    }
+
     // Serve the requested file.
-    let resp = serve_file(&req, &config.root_dir).await;
+    // Here we pass a `&PathBuf` to a function expecting a `&Path`. This works because of *deref
+    // coercions*, in this case meaning that `PathBuf` implements `Deref` with `Target=Path`.

This comment is out of date. Needs removing.

derekdreery

comment created time in 3 days

Pull request review comment tikv/rfcs

RFC: Coprocessor Plugin

+# Coprocessor Plugin
+
+## Summary
+
+Add a general and pluggable coprocessor framework for RawKV mode.
+
+## Motivation
+
+TiKV is the storage component in the TiDB ecosystem. However, the distributed computation principle suggests that computation should be as close to the data source as possible. Therefore, TiKV has embedded a subset of the TiDB executor framework to push down some computation tasks when applicable.
+
+But TiKV's capability should go far beyond that, as many distributed components can be built on top of TiKV, such as a cache, a full-text search engine, a graph database, or a NoSQL database. Like TiDB, these products will also want to push down specific computation to TiKV, which requires the coprocessor to be customizable, i.e. pluggable.
+
+For instance, a full-text search engine will persist the original document and an n-gram index on TiKV. It would be a waste of resources to read the document back and then update the index from a client. In contrast, the coprocessor plugin can generate the index from the original document and update the index in place. What's more, the coprocessor plugin can perform an index scan directly on TiKV.
+
+The goals of the coprocessor plugin are:
+
+- Do what a client can do (on a single region)
+- Provide more guarantees than a client does on RawKV
+    - Raft transaction on RawKV
+- Easy to use
+    - Out of the box
+    - Easy to deploy
+- Robust
+    - Easy to debug
+    - Log support, metrics support
+
+## Detailed design
+
+### Dynamic vs static
+
+Generally, there are two strategies for building a plugin framework: dynamic and static, meaning the plugin is either loaded at startup or embedded in the binary at compile time.
+
+![Plugin Arch](../media/plugin-arch.png)
+
+Both have pros and cons:
+
+| Static | Dynamic |
+| -- | -- |
+| ◯ High performance | X Relatively slower |
+| ◯ Easy to deploy | X Complicates the deployment process |
+| X Must build the entire TiKV | ◯ Easy to build |
+| X Very slow to build | ◯ Fast to build |
+| X Hard to debug | ◯ Easy to debug |
+
+In this RFC, we'll only focus on the dynamic plugin framework that works in RawKV mode.
+
+### Plugin runtime
+
+The plugin runtime is a new component living in `tikv::server::service::kv::Service`. It loads the dylib plugin, dispatches coprocessor requests to the plugin, and proxies the API calls from the plugin to the [`Storage`](https://tikv.github.io/doc/tikv/storage/struct.Storage.html).
+
+The path of the plugin should be specified in the config file, and the plugin is loaded at TiKV startup.
+
+### Plugin SDK
+
+The plugin SDK is a standalone Rust library that helps set up the build process for the plugin.
+
+### Multi-plugin
+
+Currently TiKV has only one coprocessor, `tidb_query`. However, without further work on statically linked plugins and txn mode support, we can't strip it from the official release. So, multiple coprocessors have to be supported. Basically, we may need to add a `gRPC` RPC for coprocessor v2 requests, in which the coprocessor name and version are given, so that TiKV is able to dispatch the request to the proper coprocessor, as well as reject the request on version mismatch.
+
+### Protobuf design
+
+```proto
+message RawCoprocessorRequest {
+    kvrpcpb.Context context = 1;
+
+    string copr_name = 2;
+    string copr_version_constraint = 3;
+
+    bytes data = 4;
+}
+
+message RawCoprocessorResponse {
+    bytes data = 1;
+
+    errorpb.Error region_error = 2;
+    string other_error = 3;
+}
+```
+
+### API design
+
+To reduce the learning overhead, the API of the coprocessor plugin should stay close to that of the client. Thus, it'll look like the `RawClient` in the [Rust Client](https://github.com/tikv/client-rust) with extra txn-like methods, e.g. `commit` and `lock`.
+
+```rust
+use std::ops::Range;
+
+use async_trait::async_trait;
+
+pub type Key = Vec<u8>;
+pub type Value = Vec<u8>;
+pub type KvPair = (Key, Value);
+
+#[derive(Debug)]
+pub struct Region {
+    id: u64,
+    start_key: Key,
+    end_key: Key,
+    region_epoch: RegionEpoch,
+}
+
+#[derive(Debug)]
+pub struct RegionEpoch {
+    pub conf_ver: u64,
+    pub version: u64,
+}
+
+#[derive(Debug)]
+pub enum Error {
+    KeyNotInRegion { key: Key, region: Region },
+    // More
+}
+
+pub type Result<T> = std::result::Result<T, Error>;
+
+#[async_trait]
+pub trait RawStorage: Send {
+    async fn get(&self, key: Key) -> Result<Option<Value>>;
+    async fn batch_get(&self, keys: Vec<Key>) -> Result<Vec<KvPair>>;
+    async fn scan(&self, key_range: Range<Key>) -> Result<Vec<Value>>;
+    async fn put(&mut self, key: Key, value: Value) -> Result<()>;
+    async fn batch_put(&mut self, kv_pairs: Vec<KvPair>) -> Result<()>;
+    async fn delete(&mut self, key: Key) -> Result<()>;
+    async fn batch_delete(&mut self, keys: Vec<Key>) -> Result<()>;
+    async fn delete_range(&mut self, key_range: Range<Key>) -> Result<()>;
+}
+
+pub trait Coprocessor: Send + Sync {
+    fn on_raw_coprocessor_request(
+        &self,
+        region: Region,
+        request: Vec<u8>,
+        storage: Box<dyn RawStorage>,
+    ) -> Result<Vec<u8>>;
+}
+```
+
+### Keyspace
+
+Keyspace [[RFC]](https://github.com/tikv/rfcs/pull/39) [[The most updated design doc]](https://docs.google.com/document/d/1x17-urAqToDo8TVXJroEHtc76fdssFaoANjSaNDhjKg/edit) is an upcoming feature of TiKV that is highly related to the coprocessor plugin. Keyspace determines whether a range of keys should only be used in transaction mode or in RawKV mode. Since a coprocessor works in either RawKV mode or txn mode, the coprocessor plugin framework should be aware of Keyspace. The details are TBD.
+
+## Future work

Could you elaborate on the testing? For example:

  • correctness testing, to ensure the plugin framework API is working as expected.
  • performance testing, to ensure the performance is as expected.
andylokandy

comment created time in 3 days
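One possible shape for the correctness testing asked about here is to run plugin logic against an in-memory stand-in for the storage API. The sketch below is an assumption-laden simplification: `MemStorage` and `count_keys` are hypothetical names, the methods are synchronous rather than `async`, and errors are omitted, so it does not implement the RFC's `RawStorage` trait.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

type Key = Vec<u8>;
type Value = Vec<u8>;

/// Hypothetical in-memory stand-in for the RFC's `RawStorage`.
/// Synchronous and infallible to keep the sketch short.
#[derive(Default)]
struct MemStorage {
    data: BTreeMap<Key, Value>,
}

impl MemStorage {
    fn get(&self, key: &[u8]) -> Option<Value> {
        self.data.get(key).cloned()
    }

    fn put(&mut self, key: Key, value: Value) {
        self.data.insert(key, value);
    }

    fn delete(&mut self, key: &[u8]) {
        self.data.remove(key);
    }

    /// Returns the values whose keys fall in `key_range` (end exclusive),
    /// in key order. Panics if the range start is greater than its end.
    fn scan(&self, key_range: Range<Key>) -> Vec<Value> {
        self.data.range(key_range).map(|(_, v)| v.clone()).collect()
    }
}

/// Hypothetical plugin logic under test: counts the keys in a range.
fn count_keys(storage: &MemStorage, key_range: Range<Key>) -> usize {
    storage.scan(key_range).len()
}
```

A test would seed `MemStorage` with a few pairs, call the plugin logic, and compare the result with the expected value. Performance testing would still need a real TiKV instance, since this stand-in says nothing about Raft or disk costs.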

Pull request review comment tikv/rfcs

RFC: Coprocessor Plugin

+### Keyspace
+
+Keyspace [[RFC]](https://github.com/tikv/rfcs/pull/39) [[The most updated design doc]](https://docs.google.com/document/d/1x17-urAqToDo8TVXJroEHtc76fdssFaoANjSaNDhjKg/edit) is an upcoming feature of TiKV that is highly related to the coprocessor plugin. Keyspace determines whether a range of keys should only be used in transaction mode or in RawKV mode. Since a coprocessor works in either RawKV mode or txn mode, the coprocessor plugin framework should be aware of Keyspace. The details are TBD.

should we consider the coprocessor plugin after the key range feature is implemented?

andylokandy

comment created time in 3 days

Pull request review comment tikv/rfcs

RFC: Coprocessor Plugin

+# Coprocessor Plugin
+
+## Summary
+
+Add a general and pluggable coprocessor framework for RawKV mode.
+
+## Motivation
+
+TiKV is the storage component in the TiDB ecosystem. However, the distributed computation principle suggests that computation should be as close to the data source as possible. Therefore, TiKV has embedded a subset of the TiDB executor framework to push down some computation tasks when applicable.
+
+But TiKV's capability should go far beyond that, as many distributed components can be built on top of TiKV, such as a cache, a full-text search engine, a graph database, or a NoSQL database. Like TiDB, these products will also want to push down specific computation to TiKV, which requires the coprocessor to be customizable, i.e. pluggable.
+
+## Detailed design
+
+### Dynamic vs static
+
+Generally, there are two strategies for building a plugin framework: dynamic and static, meaning the plugin is either loaded at startup or embedded in the binary at compile time.
+
+![Plugin Arch](../media/plugin-arch.png)
+
+Both have pros and cons:
+
+| Static | Dynamic |
+| -- | -- |
+| ◯ High performance | X Relatively slower |
+| ◯ Easy to deploy | X Complicates the deployment process |
+| X Must build the entire TiKV | ◯ Easy to build |
+| X Very slow to build | ◯ Fast to build |
+| X Hard to debug | ◯ Easy to debug |
+
+Ideally, we'd like to develop a plugin in dynamic mode and eventually distribute the statically linked one in a release. So there is an interesting research area: designing a plugin framework in which the plugin is written once and compiled to both the dynamic and the static form from the same source code. This is an ambitious target but somewhat out of scope, so we may explore the possibility as we go.
+
+So initially, in this RFC, we'll only focus on the dynamic plugin framework that works in RawKV mode.
+
+### WebAssembly
+
+WebAssembly is chosen to host the dynamic plugin. There were alternatives such as a dynamic library, Lua, and BPF. In this section, I'll explain the tradeoffs between them and why WASM is the most appropriate choice for TiKV:
+
+- Dynamic Library
+
+    Previously, a proof-of-concept plugin experiment [[repo]](https://github.com/andylokandy/plugin) was done. As a result, we found that Rust's unstable ABI is a safety risk. It's hard to guarantee that TiKV and the plugin have exactly the same ABI version, and it is also hard to debug such a plugin.
+
+- Lua
+
+    TOO SLOW
+
+- eBPF
+
+    Berkeley Packet Filter was tried in a Hackathon. It requires the plugin to be written in C, which makes it impossible to migrate `tidb_query` eventually. Also, eBPF is not Turing-complete, which is unacceptable.
+
+- WASM
+
+    WebAssembly was also tried in a Hackathon. As a result, it achieved around 50% to 80% of the performance of the statically linked one. Good work by `Wasmer`! Besides, WASM can be written in many languages and has great safety guarantees.
+
+    The Rust binding for WASM plugins should have first-class support.
+
+### Keyspace
+
+Keyspace [[RFC]](https://github.com/tikv/rfcs/pull/39) [[The most updated design doc]](https://docs.google.com/document/d/1x17-urAqToDo8TVXJroEHtc76fdssFaoANjSaNDhjKg/edit) is an upcoming feature of TiKV that is highly related to the coprocessor plugin. Keyspace determines whether a range of keys should only be used in transaction mode or in RawKV mode. Since a coprocessor works in either RawKV mode or txn mode, the coprocessor plugin framework should be aware of Keyspace. The details are TBD.
+
+### Multi-plugin
+
+Currently TiKV has only one coprocessor, `tidb_query`. However, without further work on statically linked plugins and txn mode support, we can't strip it from the official release. So, multiple coprocessors have to be supported. Basically, we may need to add a `gRPC` RPC for coprocessor v2 requests, in which the coprocessor name and version are given, so that TiKV is able to dispatch the request to the proper coprocessor, as well as reject the request on version mismatch.
+
+### API design
+

for the API spec, what's the expected performance reduction?

andylokandy

comment created time in 3 days

Pull request review comment tikv/rfcs

RFC: Coprocessor Plugin

+### Plugin runtime
+
+The plugin runtime is a new component living in `tikv::server::service::kv::Service`. It loads the dylib plugin, dispatches coprocessor requests to the plugin, and proxies the API calls from the plugin to the [`Storage`](https://tikv.github.io/doc/tikv/storage/struct.Storage.html).

Could you elaborate on the implementation of the plugin runtime? For example:

  • how to implement a plugin under this framework?
  • how to compile a plugin to a dynamic library?
  • how to load the dynamic plugin library to the tikv-server process?
  • is it possible to upgrade the dynamic plugin to a newer version without restarting the tikv-process?
andylokandy

comment created time in 3 days
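The RFC leaves the runtime's internals open, but the dispatch step it describes (route a request by coprocessor name, reject on version mismatch) can be sketched without any dynamic loading. Everything below is hypothetical: the `Registry` type, the simplified `Coprocessor` trait (the RFC's version also receives a region and a storage handle), and the exact-string version comparison, which stands in for whatever constraint syntax `copr_version_constraint` ends up using.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical plugin interface for this sketch only.
trait Coprocessor {
    fn version(&self) -> &str;
    fn on_raw_coprocessor_request(&self, request: &[u8]) -> Vec<u8>;
}

#[derive(Debug, PartialEq)]
enum DispatchError {
    UnknownCoprocessor(String),
    VersionMismatch { expected: String, found: String },
}

/// Hypothetical registry mapping coprocessor names to loaded plugins.
#[derive(Default)]
struct Registry {
    plugins: HashMap<String, Box<dyn Coprocessor>>,
}

impl Registry {
    fn register(&mut self, name: &str, plugin: Box<dyn Coprocessor>) {
        self.plugins.insert(name.to_owned(), plugin);
    }

    /// Routes `request` to the plugin called `name`, or explains why not.
    fn dispatch(
        &self,
        name: &str,
        version: &str,
        request: &[u8],
    ) -> Result<Vec<u8>, DispatchError> {
        let plugin = self
            .plugins
            .get(name)
            .ok_or_else(|| DispatchError::UnknownCoprocessor(name.to_owned()))?;
        // Assumption: versions must match exactly. A real implementation
        // would likely evaluate a semver-style constraint instead.
        if plugin.version() != version {
            return Err(DispatchError::VersionMismatch {
                expected: version.to_owned(),
                found: plugin.version().to_owned(),
            });
        }
        Ok(plugin.on_raw_coprocessor_request(request))
    }
}

/// Toy plugin that returns the request bytes unchanged.
struct Echo;

impl Coprocessor for Echo {
    fn version(&self) -> &str {
        "0.1.0"
    }

    fn on_raw_coprocessor_request(&self, request: &[u8]) -> Vec<u8> {
        request.to_vec()
    }
}
```

In this sketch a plugin is registered in-process with `register`. Loading one from a dynamic library, and swapping it without a restart, are exactly the open questions raised in the comment and are not addressed here.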

PR opened brson/basic-http-server

Multiple Improvements

I've bundled up all my previous PRs into a single PR and added basic auth. I needed all this stuff for some use of my own, but hope that the work is useful to others. This PR includes

  • Protection against accessing files or folders outside of the root directory (and a switch to turn it off)
  • Turn a panic into an error when a socket is already in use.
  • URLs not found are redirected to root, unless they look like a file. The 'unless they look like a file' part I added because I found it confusing when I'd misspelled a file name and got the index.html rather than a not found.
  • A very liberal impl of basic http auth, which will accept empty usernames and passwords. This isn't designed to be secure: I intend to use it to keep a website embargoed, but I don't care if someone guesses the password: there isn't anything private on the website. It shouldn't be used for securing things properly.
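
The first item in this list, protection against serving paths outside the root, comes down to canonicalizing the requested path and checking that it still starts with the canonical root. A minimal standalone sketch of that idea follows. The function name `resolve_in_root` and the use of `io::Error` are assumptions made for illustration (the PR itself uses its own `Error::EntityNotInRoot`).

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Returns the canonical form of `candidate` if it lies inside `root`.
/// Assumes `root` has already been canonicalized by the caller.
fn resolve_in_root(root: &Path, candidate: &Path, allow_escape: bool) -> io::Result<PathBuf> {
    // `canonicalize` resolves `.`, `..` and symlinks, and fails if the
    // path does not exist on disk.
    let path = candidate.canonicalize()?;
    // `Path::starts_with` compares whole components, so `/srv/www2`
    // is not treated as being inside `/srv/www`.
    if allow_escape || path.starts_with(root) {
        Ok(path)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "path is outside the root directory",
        ))
    }
}
```

Because the check runs on the canonical path, a request such as `ROOT/../secret.txt` is rejected even though its textual form begins with the root.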

Closes #23 #22 #19

+684 -525

0 comments

4 changed files

pr created time in 3 days

started brson/my-rust-lists

started time in 4 days

push event brson/tikv

lhy1024

commit sha 4c7ab163386d00c5dad98d8c8191e356b348663e

Fix unexpected panic when sampling in load base split (#9720)

Signed-off-by: lhy1024 <admin@liudos.us>

### What problem does this PR solve?
Issue Number: close https://github.com/pingcap/tidb/issues/22963

### Related changes
- Need to cherry-pick to the release branch

### Check List
Tests
- Unit test

### Release note
- No Release note


wangggong

commit sha 43f452f40e8abe15bf1b4802275acf9b0b94c579

copr: activate quote & add null_time_diff (#9724)

### What problem does this PR solve?
Issue Number: #9016
Problem Summary:

### What is changed and how it works?
What's Changed:
- Add `Quote` in lib.rs
- add `NullTimeDiff`

### Related changes

### Check List
Tests
- Unit test

Side effects

### Release note
No release note.


lysu

commit sha 01731cb14d6899413be73853433663b2ed09861a

copr: read value when common handle only contain prefix column (#9670)

Signed-off-by: lysu <sulifx@gmail.com>

### What problem does this PR solve?
Issue Number: close https://github.com/pingcap/tidb/issues/22811
Problem Summary: tikv will avoid read value when required column in common handle but common handle's columns maybe not contain full columns value (like `primary key(c(1))`)

### What is changed and how it works?
What's Changed: read kv's value when required column in common handle is the prefix of column
need merge https://github.com/pingcap/tidb/pull/22829 first

### Related changes
- n/a

### Check List
Tests
- Integration test(WIP)

Side effects
- n/a

### Release note
- no release note


Zhuomin Liu

commit sha 2fd0489a90ab332410f825bfe7dc9cdb013f68e7

copr: support analyze common handle and columns by scan once table rows (#9175)

Signed-off-by: lzmhhh123 <lzmhhh123@gmail.com>

### What problem does this PR solve?
Issue Number: close https://github.com/pingcap/tidb/issues/20243
related changes:
- https://github.com/pingcap/tipb/pull/197
- https://github.com/pingcap/tidb/pull/21381

### What is changed and how it works?
What's Changed:

### Check List
Tests
- Unit test
- Integration test

Side effects
- Performance regression
- Consumes more MEM

### Release note
- none.


wangggong

commit sha 993ec6b818e20404cad2234c6314209a19f1074a

copr: add instr (#9646)

Signed-off-by: wangggong <793160615@qq.com>

### What problem does this PR solve?

Issue Number: #9016

Problem Summary:

### What is changed and how it works?

What's Changed:

### Related changes

- PR to update `pingcap/docs`/`pingcap/docs-cn`:
- PR to update `pingcap/tidb-ansible`:
- Need to cherry-pick to the release branch

### Check List

Tests

- Unit test

Side effects

### Release note

No release note.


Jay

commit sha 47e03ae7906e7c32208ea01521416bae6c66ad30

*: update dependencies (#9715) Signed-off-by: Jay Lee <BusyJayLee@gmail.com> Co-authored-by: qupeng <qupeng@pingcap.com>


Jay

commit sha 9c285ee79973ef8adf25bb20dadd34171a1abf14

*: fix build on CI (#9738)

### What problem does this PR solve?

grpcio-sys is updated to a new version that detects compiler changes correctly, so it recompiles instead of reusing outdated caches.

openssl-src is replaced with an older version because the change from 1.1.1g to 1.1.1j makes the check on self-signed keys stricter, which can affect current users. For example, upgrading an existing cluster can disconnect the updated nodes because of certificate errors. This PR therefore downgrades openssl until there is a good solution to the problem, so that 5.0.0 is not blocked.

### Check List

Tests

- Unit test
- Integration test
- Manual test (add detailed scripts or steps below)

### Release note

- No release note.


5kbpers

commit sha 6d8ab2fea128885dd6da294b67154c5a851d6b9f

resolved_ts: add scanner (#9700)

Signed-off-by: 5kbpers <tangminghua@pingcap.com>

### What problem does this PR solve?

Problem Summary: Separate from #9435. Introduce resolved ts key/value scanner.

### Check List

Tests

- Unit test
- Integration test

### Release note

* N/A


ti-srebot

commit sha 9356fd57e62267cbe59f93fa3a9315c2ae788823

rust-rocksdb: rocksdb: Fix the bug that the key manager is not updated during the Rename (#9736) update rust-rocksdb to include tikv/rust-rocksdb#613 for master ### Release note - No release note


Jay

commit sha 2cee812f236e3134c9df8b37e73ac51a5f90690f

raftstore: disable hibernate region except on master (#9735)

### What problem does this PR solve?

Problem Summary: This PR adds a function to check whether the binary is built on the master branch. Hibernate region is an experimental feature that should not be enabled on release branches, but it was enabled by accident in 5.0.0-rc. Checking the branch name at build time can prevent such a mistake from happening again.

The side effect of the PR is that a binary built in a PR will also have hibernate regions disabled, even if it is checked out from master.

### Check List

Tests

- Unit test
- Integration test

### Release note

- disable hibernate region by default as it's still unstable
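
The build-time check this commit describes can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the code from the PR: the environment variable name `BUILD_GIT_BRANCH` and the function names `is_master_branch` and `default_hibernate_regions` are hypothetical and chosen only for illustration.

```rust
/// Branch name captured at compile time, if the build environment exported one.
/// `BUILD_GIT_BRANCH` is a hypothetical variable name used for illustration.
const BUILD_BRANCH: Option<&str> = option_env!("BUILD_GIT_BRANCH");

/// Returns true only when the branch name is known and is exactly "master"
/// (surrounding whitespace, such as a trailing newline, is ignored).
pub fn is_master_branch(branch: Option<&str>) -> bool {
    matches!(branch.map(str::trim), Some("master"))
}

/// Default for the experimental feature: enabled only on builds from master.
/// An unknown branch is treated like a non-master branch.
pub fn default_hibernate_regions() -> bool {
    is_master_branch(BUILD_BRANCH)
}
```

With this shape, a build whose branch name is missing or is anything other than `master` gets the feature disabled by default, which is consistent with the side effect the commit message mentions for binaries built from a PR branch.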


wjHuang

commit sha 32ec29d2b447c5eb7a9a7ee8474359b1dfa70046

copr: support decoding the new index format (#9383)

Signed-off-by: wjhuang2016 <huangwenjun1997@gmail.com>

### What problem does this PR solve?

Refer to https://github.com/pingcap/tidb/pull/20220

### What is changed and how it works?

Proposal: [xxx](url)

What's Changed:

### Related changes

### Check List

Tests

- Unit test

Side effects

### Release note

- No release note


Zijie Lu

commit sha f810a2923e5e4a07e1ce9202f6e570b3782d4518

raftstore: consider tick count in lease check (#9737)

Signed-off-by: Zijie Lu <wslzj40@gmail.com>

### What problem does this PR solve?

Issue Number: close #9728

Problem Summary:

### What is changed and how it works?

What's Changed: consider tick count in lease check

### Related changes

- PR to update `pingcap/docs`/`pingcap/docs-cn`:
- PR to update `pingcap/tidb-ansible`:

### Check List

Tests

- Unit test

Side effects

- Performance regression
- Consumes more CPU
- Consumes more MEM

### Release note

- No release note


Lei Zhao

commit sha 4beb480b390a1a379191b65dd8bcf1b9ebb25b7d

storage: record key and start_ts in committed error (#9743)

Signed-off-by: youjiali1995 <zlwgx1023@gmail.com>

### What problem does this PR solve?

Problem Summary: I need to debug an issue with resolving locks, but I can't get the start_ts and key from the `Committed` error:

```
resolveLocks failed: unexpected resolve err: abort:"Txn(Mvcc(Committed { commit_ts: TimeStamp(423311069165125656) }))"
```

### What is changed and how it works?

What's Changed: record key and start_ts in `Committed` error

### Related changes

### Check List

Tests

- No code

Side effects

- Performance regression
- Consumes more CPU

### Release note

- No release note


Lei Zhao

commit sha f684d82cdb9e30876ca78c09d63990345a83f29a

txn_types: ignore unknown bytes when parsing Write/Lock (#9659)

Signed-off-by: youjiali1995 <zlwgx1023@gmail.com>

### What problem does this PR solve?

Problem Summary: To make downgrading easier, ignore unknown bytes when parsing Write/Lock.

https://github.com/tikv/sig-transaction/pull/93

### What is changed and how it works?

What's Changed: Ignore unknown bytes when parsing Write/Lock.

### Related changes

### Check List

Tests

- Unit test

### Release note

- No release note.
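
The idea of tolerating unknown trailing bytes so that an older binary can still read records written by a newer one can be sketched as below. This is an illustration only: the `Record` type, the tag bytes `TAG_SHORT_VALUE` and `TAG_FLAG`, and the byte layout are all hypothetical and are not claimed to match TiKV's actual Write/Lock encoding.

```rust
/// Hypothetical record: a kind byte followed by optional tagged fields.
#[derive(Debug, PartialEq)]
pub struct Record {
    pub kind: u8,
    pub short_value: Option<Vec<u8>>,
    pub flag: bool,
}

// Hypothetical field tags known to this version of the parser.
const TAG_SHORT_VALUE: u8 = b'v';
const TAG_FLAG: u8 = b'F';

/// Parses known fields and silently ignores everything from the first
/// unknown tag onward, instead of returning an error.
pub fn parse_record(mut buf: &[u8]) -> Result<Record, String> {
    let (&kind, rest) = buf.split_first().ok_or("empty input")?;
    buf = rest;
    let mut rec = Record { kind, short_value: None, flag: false };
    while let Some((&tag, rest)) = buf.split_first() {
        match tag {
            TAG_SHORT_VALUE => {
                // One length byte, then that many bytes of value.
                let (&len, rest) = rest.split_first().ok_or("missing length")?;
                let len = len as usize;
                if rest.len() < len {
                    return Err("truncated value".to_string());
                }
                rec.short_value = Some(rest[..len].to_vec());
                buf = &rest[len..];
            }
            TAG_FLAG => {
                rec.flag = true;
                buf = rest;
            }
            // Unknown tag, presumably written by a newer version. Its length
            // is not known here, so stop and ignore the remaining bytes.
            _ => break,
        }
    }
    Ok(rec)
}
```

The trade-off in this sketch is that corruption in a known field is still reported, while anything after the first unrecognized tag is dropped without complaint.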


Yilin Chen

commit sha 311515d67331dee4bf1f34ebeb6200bd6061f711

raftstore: re-propose read index with no lock checking response (#9721)

* raftstore: re-propose read index with no lock checking response

  Signed-off-by: Yilin Chen <sticnarf@gmail.com>

* add integration tests

  Signed-off-by: Yilin Chen <sticnarf@gmail.com>

* fix metrics

  Signed-off-by: Yilin Chen <sticnarf@gmail.com>

* resend read index by sending raft command through router

  Signed-off-by: Yilin Chen <sticnarf@gmail.com>


Yilin Chen

commit sha d45d32191babae83bffd9bdd7d563db14192f8b5

concurrency manager: fix a memory leak when iterating skiplist (#9722) Signed-off-by: Yilin Chen <sticnarf@gmail.com> Co-authored-by: Ti Chi Robot <71242396+ti-chi-bot@users.noreply.github.com>


qupeng

commit sha 5d03967eddf1723339b8ac09fa88ddc0dcb3acba

ignore max_row_versions in check_need_gc (#9703)

* ignore max_row_versions in check_need_gc

  Signed-off-by: qupeng <qupeng@pingcap.com>

* add a unit test

  Signed-off-by: qupeng <qupeng@pingcap.com>

* fix the new test case

  Signed-off-by: qupeng <qupeng@pingcap.com>

* address comments

  Signed-off-by: qupeng <qupeng@pingcap.com>

* remove useless code

  Signed-off-by: qupeng <qupeng@pingcap.com>


lhy1024

commit sha f6070318b96461518817f5f030b2b0b5b0e1f9a0

Add test for panic in load-base-split (#9742) Signed-off-by: lhy1024 <admin@liudos.us> Co-authored-by: Ti Chi Robot <71242396+ti-chi-bot@users.noreply.github.com>


MyonKeminta

commit sha 86ddcfcf4d6891616b9b2381314d2ef182e44630

config: Make labels' format constaint consistent with PD (#9745)

* config: Make labels' format constaints consistent with PD

  Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

* Address comments

  Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

Co-authored-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Co-authored-by: Ti Chi Robot <71242396+ti-chi-bot@users.noreply.github.com>


aimeedeer

commit sha f66186e52bdf88f91be8bed8bba29094283d3359

Merge remote-tracking branch 'origin/master' into engine_traits_tests-checkpoint


push time in 4 days

started brson/rust-anthology

started time in 5 days

fork NickSchmitt/my-rust-lists

Lightly organized personal notes about Rust

fork in 5 days

started brson/rust-anthology

started time in 5 days

pull request comment tikv/rfcs

RFC: Coprocessor Plugin

@nrc RawTransaction is removed from this RFC. It deserves an individual RFC later.

andylokandy

comment created time in 6 days

started brson/basic-http-server

started time in 7 days

started brson/rust-anthology

started time in 7 days

started brson/rust-anthology

started time in 8 days

started brson/stdx

started time in 8 days

started brson/stdx

started time in 9 days

started brson/rust-anthology

started time in 10 days

Pull request review comment tikv/rfcs

RFC: Coprocessor Plugin

+# Coprocessor Plugin
+
+## Summary
+
+Add a general and pluggable coprocessor framework for RawKV mode.
+
+## Motivation
+
+TiKV is the storage component in the TiDB ecosystem, however, the distribution computation principle suggests that computation should be as close to the data source as possible. Therefore, TiKV has embedded a subset of the TiDB executor framework to push down some computation tasks when applicable.
+
+But TiKV's capability should be far beyond that, as many distributed components can be built on top of TiKV, such as cache, full text search engine, graph database and NoSQL database. And same as TiDB, these product will also like to push down specific computation to TiKV, which requires the coprocessor to be customizable, aka pluggable.
+
+## Detailed design
+
+### Dynamic vs statically
+
+Generally, there are two strategies to build a plugin framework: dynamically and statically, which means to load the plugin on startup or to embed in the binary on compilation.
+
+![Plugin Arch](../media/plugin-arch.png)
+
+They have both pros and cons:
+
+| Static | Dynamic |
+| -- | -- |
+| ◯ High performance | X Relatively slower |
+| ◯ Easy to deploy | X Complexify the deploy process |
+| X Build the entire TiKV | ◯ Easy to build |
+| X Build very slow | ◯ Build fast |
+| X Hard to debug | ◯ Easy to debug |
+
+Ideally, we'd like to develop a plugin in dynamic mode, and eventually, to distribute the statically linked one in release. So there is an interesting research area that designing a plugin framework that can write the plugin once, and compile to the dynamic one and the static one with duplicate code. Anyway, this is an ambitious target but sort of out of scope, so we may explore the possibility when marching on.
+
+So initially, in this RFC, we'll only focus on the dynamic plugin framework that works in RawKV mode.
+
+### Web Assembly
+
+Web Assembly is chosen to host the dynamic plugin. There was alternatives like dynamic lib, lua and bpf. So in this section, I'll explain the tradeoff between them and why WASM is the most appropriate choice for TiKV:
+
+- Dynamic Library
+
+    Previously, a proof-of-concept plugin experiment [[repo]](https://github.com/andylokandy/plugin) is done. As a result, we found that Rust's unstable ABI is a risk in safety. It's hard to guarantee that the TiKV and the plugin has absolutely the same ABI version, and is also hard to debug such a plugin.
+
+- Lua
+
+    TOO SLOW
+
+- eBPF
+
+    Berkeley Packet Filter has been experimented in Hackathon. It requires the plugin to be written in C, which is not capable to migrate `tidb_query` eventually. And also, eBPF is not turing-complete, which is unacceptable.
+
+- WASM
+
+    Web Assembly has also been experimented in Hackathon. As the result, it made the performance score around 50% to 80% to the statically linked one. Good work for `Wasmer`! Besides, WASM can be written in many languages and has great safety guarantee.
+
+    The Rust binding for WASM plugin should be first-class supported.
+
+### Keyspace
+
+Keyspace[[RFC]](https://github.com/tikv/rfcs/pull/39)[[The most updated design doc]](https://docs.google.com/document/d/1x17-urAqToDo8TVXJroEHtc76fdssFaoANjSaNDhjKg/edit) is an incoming feature of TiKV that is highly related to coprocessor plugin. Keyspace determines whether a range of key should only be used in transaction mode or in RawKV mode. Since coprocessor works in either RawKV mode or txn mode, surely coprocessor plugin framework should aware of Keyspace. The details is TBD.
+
+### Multi-plugin
+
+Currently TiKV has only one coprocessor `tidb_query`. However, without further work on statically linked plugin and txn mode support, we can't strip it from official release. So, multiple coprocessor has to be supported. Basically, we may need to add a `gPRC` rpc for coprocessor v2 request, in which coprocessor name and version is given, so that TiKV will be able to dispatch the request to the proper coprocessor, as well as to reject the request on version mismatch.
+
+### API design
+
+To reduce the learning overhead, it'll be better that the API of the coprocessor plugin get closer to the client. Thus, it'll looks like the `RawClient` in the [Rust Client](https://github.com/tikv/client-rust) with extra txn-like methods e.g. `commit` and `lock`.
+
+```rust
+pub type Key = Vec<u8>;
+pub type Value = Vec<u8>;
+
+#[derive(Debug)]
+pub struct Region {
+    id: u64,
+    start_key: Key,
+    end_key: Key,
+    region_epoch: RegionEpoch,
+}
+
+#[derive(Debug)]
+pub struct RegionEpoch {
+    pub conf_ver: u64,
+    pub version: u64,
+}
+
+#[derive(Debug)]
+pub enum Error {
+    KeyNotInRegion { key: Key, region: Region },
+    Deadlock { key: Key },
+    // More
+}
+
+pub type Result<T> = std::result::Result<T, Error>;
+
+#[async_trait]
+pub trait RawTransaction: Send {
+    /// Acquire memory lock for a key. All other trivial rpc requests or coprocessor
+    /// lock acquire to the same key will be blocked by this lock. This lock will be
+    /// released when the transaction is committed or dropped.
+    async fn lock(&self, key: Key) -> Result<()>;
+
+    async fn get(&self, key: Key) -> Result<Option<Value>>;
+    async fn scan(&self, key_range: Range<Key>) -> Vec<Value>;
+
+    async fn put(&mut self, key: Key, value: Value) -> Result<()>;
+    async fn delete(&mut self, key: Key) -> Result<()>;
+    async fn delete_range(&mut self, key_range: Range<Key>) -> Result<()>;
+
+    /// Returns when Raft message applied successfully.
+    async fn commit(self) -> Result<()>;

why are the 'building up' methods async?

It's because every mutation will acquire the memory lock.

andylokandy

comment created time in 10 days
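
The reply above says the mutation methods are async because every mutation acquires the memory lock. A minimal, synchronous sketch of that coupling follows. It is an illustration under stated assumptions, not the RFC's design: `LockTable`, `Txn`, and `pending` are hypothetical names, and where the RFC's async `put` would presumably wait for a contended lock, this sketch simply returns the contended key as an error.

```rust
use std::collections::{BTreeMap, HashSet};
use std::sync::{Arc, Mutex};

pub type Key = Vec<u8>;
pub type Value = Vec<u8>;

/// Hypothetical in-memory lock table shared by all transactions.
#[derive(Default)]
pub struct LockTable {
    locked: Mutex<HashSet<Key>>,
}

/// Hypothetical transaction that buffers writes and holds key locks.
pub struct Txn {
    table: Arc<LockTable>,
    held: HashSet<Key>,
    writes: BTreeMap<Key, Option<Value>>,
}

impl Txn {
    pub fn new(table: Arc<LockTable>) -> Self {
        Txn { table, held: HashSet::new(), writes: BTreeMap::new() }
    }

    /// Takes the lock for `key`; returns false if another transaction holds it.
    fn try_lock(&mut self, key: &Key) -> bool {
        if self.held.contains(key) {
            return true;
        }
        let mut locked = self.table.locked.lock().unwrap();
        if locked.insert(key.clone()) {
            self.held.insert(key.clone());
            true
        } else {
            false
        }
    }

    /// Buffers a write, but only after the key's lock has been acquired.
    pub fn put(&mut self, key: Key, value: Value) -> Result<(), Key> {
        if !self.try_lock(&key) {
            return Err(key);
        }
        self.writes.insert(key, Some(value));
        Ok(())
    }

    /// Buffers a deletion, again acquiring the key's lock first.
    pub fn delete(&mut self, key: Key) -> Result<(), Key> {
        if !self.try_lock(&key) {
            return Err(key);
        }
        self.writes.insert(key, None);
        Ok(())
    }

    /// Number of buffered mutations.
    pub fn pending(&self) -> usize {
        self.writes.len()
    }
}

impl Drop for Txn {
    /// Releases every lock this transaction holds, mirroring the RFC comment
    /// that locks are released when the transaction is committed or dropped.
    fn drop(&mut self) {
        if let Ok(mut locked) = self.table.locked.lock() {
            for key in &self.held {
                locked.remove(key);
            }
        }
    }
}
```

Because acquiring the lock can be contended, even a "building up" call such as `put` has a point where it may need to wait, which is the stated reason for making it async.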

Pull request review comment tikv/rfcs

RFC: Coprocessor Plugin

+- WASM
+
+    Web Assembly has also been experimented in Hackathon. As the result, it made the performance score around 50% to 80% to the statically linked one. Good work for `Wasmer`! Besides, WASM can be written in many languages and has great safety guarantee.
+
+    The Rust binding for WASM plugin should be first-class supported.

Updated in doc

andylokandy

comment created time in 11 days

Pull request review comment tikv/rfcs

RFC: Coprocessor Plugin

+### Multi-plugin
+
+Currently TiKV has only one coprocessor `tidb_query`. However, without further work on statically linked plugin and txn mode support, we can't strip it from official release. So, multiple coprocessor has to be supported. Basically, we may need to add a `gPRC` rpc for coprocessor v2 request, in which coprocessor name and version is given, so that TiKV will be able to dispatch the request to the proper coprocessor, as well as to reject the request on version mismatch.

Updated in doc

andylokandy

comment created time in 11 days
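
The Multi-plugin paragraph in the hunk above describes dispatching a request by coprocessor name and rejecting it on version mismatch. A rough sketch of that dispatch step is shown below. Everything in it is an assumption for illustration: `Registry`, `Handler`, `DispatchError`, and the use of a plain `u32` version are hypothetical and are not taken from the RFC or from TiKV.

```rust
use std::collections::HashMap;

/// Hypothetical reasons a request cannot be routed.
#[derive(Debug, PartialEq)]
pub enum DispatchError {
    UnknownCoprocessor(String),
    VersionMismatch { expected: u32, got: u32 },
}

/// Hypothetical handler signature: raw request bytes in, raw response bytes out.
pub type Handler = fn(&[u8]) -> Vec<u8>;

/// Maps a coprocessor name to its registered version and handler.
#[derive(Default)]
pub struct Registry {
    plugins: HashMap<String, (u32, Handler)>,
}

impl Registry {
    pub fn register(&mut self, name: &str, version: u32, handler: Handler) {
        self.plugins.insert(name.to_string(), (version, handler));
    }

    /// Routes the payload to the named coprocessor, refusing the request if
    /// the name is unknown or the requested version differs.
    pub fn dispatch(
        &self,
        name: &str,
        version: u32,
        payload: &[u8],
    ) -> Result<Vec<u8>, DispatchError> {
        let (expected, handler) = self
            .plugins
            .get(name)
            .ok_or_else(|| DispatchError::UnknownCoprocessor(name.to_string()))?;
        if *expected != version {
            return Err(DispatchError::VersionMismatch { expected: *expected, got: version });
        }
        Ok((*handler)(payload))
    }
}
```

Exact version equality is the simplest possible policy; a real framework might prefer a compatibility range, which the RFC text does not specify.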

Pull request review comment tikv/rfcs

RFC: Coprocessor Plugin

+# Coprocessor Plugin
+
+## Summary
+
+Add a general and pluggable coprocessor framework for RawKV mode.
+
+## Motivation

Updated the doc

andylokandy

comment created time in 11 days

Pull request review comment tikv/rfcs

RFC: Coprocessor Plugin

+pub type Result<T> = std::result::Result<T, Error>;
+
+#[async_trait]
+pub trait RawTransaction: Send {

Updated the section

andylokandy

comment created time in 11 days

Pull request review comment tikv/rfcs

RFC: Coprocessor Plugin

+# Coprocessor Plugin
+
+## Summary
+
+Add a general and pluggable coprocessor framework for RawKV mode.
+
+## Motivation
+
+TiKV is the storage component in the TiDB ecosystem, however, the distribution computation principle suggests that computation should be as close to the data source as possible. Therefore, TiKV has embedded a subset of the TiDB executor framework to push down some computation tasks when applicable.
+
+But TiKV's capability should be far beyond that, as many distributed components can be built on top of TiKV, such as cache, full text search engine, graph database and NoSQL database. And same as TiDB, these product will also like to push down specific computation to TiKV, which requires the coprocessor to be customizable, aka pluggable.
+
+## Detailed design
+
+### Dynamic vs statically
+
+Generally, there are two strategies to build a plugin framework: dynamically and statically, which means to load the plugin on startup or to embed in the binary on compilation.
+
+![Plugin Arch](../media/plugin-arch.png)
+
+They have both pros and cons:
+
+| Static | Dynamic |
+| -- | -- |
+| ◯ High performance | X Relatively slower |
+| ◯ Easy to deploy | X Complexify the deploy process |
+| X Build the entire TiKV | ◯ Easy to build |
+| X Build very slow | ◯ Build fast |
+| X Hard to debug | ◯ Easy to debug |
+
+Ideally, we'd like to develop a plugin in dynamic mode, and eventually, to distribute the statically linked one in release. So there is an interesting research area that designing a plugin framework that can write the plugin once, and compile to the dynamic one and the static one with duplicate code. Anyway, this is an ambitious target but sort of out of scope, so we may explore the possibility when marching on.
+
+So initially, in this RFC, we'll only focus on the dynamic plugin framework that works in RawKV mode.
+
+### Web Assembly
+
+Web Assembly is chosen to host the dynamic plugin. There was alternatives like dynamic lib, lua and bpf. So in this section, I'll explain the tradeoff between them and why WASM is the most appropriate choice for TiKV:
+
+- Dynamic Library
+
+    Previously, a proof-of-concept plugin experiment [[repo]](https://github.com/andylokandy/plugin) is done. As a result, we found that Rust's unstable ABI is a risk in safety. It's hard to guarantee that the TiKV and the plugin has absolutely the same ABI version, and is also hard to debug such a plugin.
+
+- Lua
+
+    TOO SLOW
+
+- eBPF
+
+    Berkeley Packet Filter has been experimented in Hackathon. It requires the plugin to be written in C, which is not capable to migrate `tidb_query` eventually. And also, eBPF is not turing-complete, which is unacceptable.
+
+- WASM
+
+    Web Assembly has also been experimented in Hackathon. As the result, it made the performance score around 50% to 80% to the statically linked one. Good work for `Wasmer`! Besides, WASM can be written in many languages and has great safety guarantee.
+
+    The Rust binding for WASM plugin should be first-class supported.
+
+### Keyspace
+
+Keyspace[[RFC]](https://github.com/tikv/rfcs/pull/39)[[The most updated design doc]](https://docs.google.com/document/d/1x17-urAqToDo8TVXJroEHtc76fdssFaoANjSaNDhjKg/edit) is an incoming feature of TiKV that is highly related to coprocessor plugin. Keyspace determines whether a range of key should only be used in transaction mode or in RawKV mode. Since coprocessor works in either RawKV mode or txn mode, surely coprocessor plugin framework should aware of Keyspace. The details is TBD.
+
+### Multi-plugin
+
+Currently TiKV has only one coprocessor `tidb_query`. However, without further work on statically linked plugin and txn mode support, we can't strip it from official release. So, multiple coprocessor has to be supported. Basically, we may need to add a `gPRC` rpc for coprocessor v2 request, in which coprocessor name and version is given, so that TiKV will be able to dispatch the request to the proper coprocessor, as well as to reject the request on version mismatch.
+
+### API design
+
+To reduce the learning overhead, it'll be better that the API of the coprocessor plugin get closer to the client. Thus, it'll looks like the `RawClient` in the [Rust Client](https://github.com/tikv/client-rust) with extra txn-like methods e.g. `commit` and `lock`.
+
+```rust
+pub type Key = Vec<u8>;
+pub type Value = Vec<u8>;
+
+#[derive(Debug)]
+pub struct Region {
+    id: u64,
+    start_key: Key,
+    end_key: Key,
+    region_epoch: RegionEpoch,
+}
+
+#[derive(Debug)]
+pub struct RegionEpoch {
+    pub conf_ver: u64,
+    pub version: u64,
+}
+
+#[derive(Debug)]
+pub enum Error {
+    KeyNotInRegion { key: Key, region: Region },
+    Deadlock { key: Key },
+    // More
+}
+
+pub type Result<T> = std::result::Result<T, Error>;
+
+#[async_trait]
+pub trait RawTransaction: Send {
+    /// Acquire memory lock for a key. All other trivial rpc requests or coprocessor
+    /// lock acquire to the same key will be blocked by this lock. This lock will be
+    /// released when the transaction is committed or dropped.
+    async fn lock(&self, key: Key) -> Result<()>;
+
+    async fn get(&self, key: Key) -> Result<Option<Value>>;
+    async fn scan(&self, key_range: Range<Key>) -> Vec<Value>;
+
+    async fn put(&mut self, key: Key, value: Value) -> Result<()>;
+    async fn delete(&mut self, key: Key) -> Result<()>;
+    async fn delete_range(&mut self, key_range: Range<Key>) -> Result<()>;
+
+    /// Returns when Raft message applied successfully.
+    async fn commit(self) -> Result<()>;

Updated in the doc.

andylokandy

comment created time in 11 days
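
For readers skimming the hunk above: the Multi-plugin section says a coprocessor v2 request carries a coprocessor name and version, so that TiKV can route it to the right coprocessor and reject it on a version mismatch. A minimal sketch of that rule might look like the following. Everything named here (`CoprocessorRequest`, `Plugin`, `Registry`, `DispatchError`, `echo_reversed`, and the use of a plain `u32` version with exact-match comparison) is a hypothetical illustration, not part of the RFC or of TiKV.

```rust
use std::collections::HashMap;

/// Hypothetical request shape: the RFC only states that a name and a
/// version are given; the payload field is an assumption.
#[derive(Debug, Clone)]
pub struct CoprocessorRequest {
    pub name: String,
    pub version: u32,
    pub payload: Vec<u8>,
}

/// Hypothetical errors for the two rejection cases.
#[derive(Debug, PartialEq)]
pub enum DispatchError {
    UnknownCoprocessor(String),
    VersionMismatch { expected: u32, got: u32 },
}

/// Hypothetical plugin entry: a version plus a handler over raw bytes.
pub struct Plugin {
    pub version: u32,
    pub handler: fn(&[u8]) -> Vec<u8>,
}

/// Maps coprocessor names to registered plugins.
#[derive(Default)]
pub struct Registry {
    plugins: HashMap<String, Plugin>,
}

impl Registry {
    pub fn register(&mut self, name: &str, plugin: Plugin) {
        self.plugins.insert(name.to_string(), plugin);
    }

    /// Route the request by name, then reject it if the versions differ.
    /// Exact-match comparison is an assumption; the RFC does not say how
    /// versions are compared.
    pub fn dispatch(&self, req: &CoprocessorRequest) -> Result<Vec<u8>, DispatchError> {
        let plugin = self
            .plugins
            .get(&req.name)
            .ok_or_else(|| DispatchError::UnknownCoprocessor(req.name.clone()))?;
        if plugin.version != req.version {
            return Err(DispatchError::VersionMismatch {
                expected: plugin.version,
                got: req.version,
            });
        }
        Ok((plugin.handler)(&req.payload))
    }
}

/// Toy handler used only to exercise the registry.
pub fn echo_reversed(input: &[u8]) -> Vec<u8> {
    input.iter().rev().copied().collect()
}
```

Registering `echo_reversed` under some name at version 2 and dispatching a request with the same name and version returns the reversed payload. A request with a different version, or with an unregistered name, is rejected with the matching `DispatchError` variant.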