Nikhil Benesch (benesch) · @MaterializeInc · New York, NY · Systems engineer

ballista-compute/sqlparser-rs 472

Extensible SQL Lexer and Parser for Rust

benesch/backport 2

automatically backport pull requests

benesch/adspygoogle.dfp 1

setuptools fork of Google DoubleClick for Publishers API Python Client

benesch/autouseradd 1

👨‍🚒 put out fires started by `docker run --user`

benesch/abomonation 0

A mortifying serialization library for Rust

benesch/abomonation_derive 0

A macros 1.1 #[derive(Abomonation)] implementation for the abomonation crate

benesch/Amethyst 0

Tiling window manager for OS X à la xmonad.

issue comment MaterializeInc/materialize

Docs: Document minimum requirements to run materialize

The start of this documentation is here: https://materialize.io/docs/ops/deployment/. It's not yet very specific about hardware requirements because we don't have the necessary experience to state any rules of thumb. But we are certainly interested in coming up with those rules of thumb over time.

krishmanoh2

comment created time in a day

issue comment MaterializeInc/materialize

UX - Basic installer for materialize

It is preferable that whatever ships in apt or yum packages is also available in the tarball available for download. An end-user may not have the ability to install a package (lacks privileges) or may not be able to use the materialize repo (company policy).

I'm not sure I understand the ask here. The same binary that's in the apt package is in the tarball. If the goal is to install the systemd unit file into the right spot, that's something that requires root privileges, so if you don't have privileges to install apt packages, you almost certainly do not have privileges to install systemd unit files. Plus the correct location for the systemd unit file is inherently platform specific—not all Linuxes even use systemd—so it's not really something that makes sense to include in a tarball that attempts to be usable on nearly all Linux distributions.

Rather than have the user run materialize and then figure out that the environment is not right, the installer can prevent this at the onset.

So, in a world with an installer, you have to run the installer, and then run materialized. In the current world, you just run materialized. In either case, the first command that you run will tell you whether things are configured correctly. The installer just adds an extra step to the process.

krishmanoh2

comment created time in a day

issue comment MaterializeInc/materialize

UX - materialized - mimic --data-directory for log directory as well

Hmm, I'm not sure I understand what the problem with the current design is. Is it just that we don't ensure that the parent directories of the specified log file exist? Your proposal would make things a bit less flexible by requiring that the file be named materialized.log, and I'm not sure why that restriction would be good? Plus most users are not presently expected to configure the log file, and simply leave it in its default location.

I can definitely imagine that we will need to support "Enterprise Grade Logging" at some point, where we rotate log files (#4599) and can target a variety of logging sinks, like syslog, have slow query logs, etc. But I'm not sure we understand the common deployment strategies enough to be able to nail the design of that feature right now.

krishmanoh2

comment created time in a day

issue comment MaterializeInc/materialize

UX - Implement log rotation

I agree this is a nice-to-have, but I'd like to propose that this be considered low priority, because a sufficiently dedicated user can wire this up themselves with standard programs like logrotate.

krishmanoh2

comment created time in a day

issue comment MaterializeInc/materialize

DOCS - document the subsystems in materialize such as dataflow, coord, pgwire, sql_parser

Logs are a great way to educate end-users. By exposing more information in logs and documenting it, we help customers teach themselves.

I'd like to gently push back on this claim. In my opinion, logs, especially info/debug logs, are the last resort for educating end users. They are primarily intended to help developers diagnose problems, with the secondary goal of being useful to the most sophisticated of users. I do not think it should be a goal for most users to be able to understand most log messages, with the exception of errors/warnings.

In general, error log messages indicate a deficiency in proper error reporting channels, like the issue with CSV sources that you pointed out. Practically every error log should be converted to a runtime error that is surfaced through SQL, where it is much more accessible to users. To a first approximation, whenever we currently might expect a user to understand a log message, I'd prefer to try to surface that information outside of the logs instead.
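As a toy illustration of that principle (hypothetical functions, not Materialize's actual code), the difference is between logging a failure and returning it as a value that the SQL layer can surface:

use std::fmt;

#[derive(Debug)]
struct SourceError(String);

impl fmt::Display for SourceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid CSV row: {}", self.0)
    }
}

// Anti-pattern: the failure is visible only to whoever reads the server logs.
fn decode_row_logging(row: &str) -> Option<Vec<String>> {
    if row.contains(',') {
        Some(row.split(',').map(|s| s.to_owned()).collect())
    } else {
        eprintln!("error: invalid CSV row: {}", row); // buried in the logs
        None
    }
}

// Preferred: the error propagates as a value, so it can be reported through
// SQL as a runtime error, where the user will actually see it.
fn decode_row(row: &str) -> Result<Vec<String>, SourceError> {
    if row.contains(',') {
        Ok(row.split(',').map(|s| s.to_owned()).collect())
    } else {
        Err(SourceError(row.to_owned()))
    }
}

fn main() {
    decode_row_logging("not-a-csv-row");
    if let Err(e) = decode_row("not-a-csv-row") {
        println!("surfaced to the user: {}", e);
    }
}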

krishmanoh2

comment created time in a day

issue comment MaterializeInc/materialize

UX: Startup script for materialize

It's a systemd unit file, and the location into which it should be installed varies from platform to platform. I'm not sure it makes sense to include it in the tarball. In a world where we have an RPM package, I think we should encourage folks to use either the RPM or apt package rather than the tarball. The tarball is mostly intended for experimentation.

krishmanoh2

comment created time in a day

PullRequestReviewEvent

issue comment MaterializeInc/materialize

UX - default to as of now() and run the select if/when unable to determine a timestamp (selecting from non materialized sources)

The reasons for this behavior are complicated, and I agree the UX is bad, but AS OF now() does not actually do what you want in most cases. If you are just trying Materialize on for size, typically you want to run CREATE MATERIALIZED SOURCE, not CREATE SOURCE.

krishmanoh2

comment created time in a day

delete branch benesch/materialize

delete branch : peek-materialize

delete time in a day

pull request comment rusoto/rusoto

Disable chrono default features

rusoto-credential depends on clock, but I got rid of as many features as I could when I resubmitted this as https://github.com/rusoto/rusoto/pull/1829!

benesch

comment created time in a day

push event MaterializeInc/materialize

Nikhil Benesch

commit sha 25ae54327e19710891c752d58a1be31b9808964e

sql: remove dead materialized parameter for peeks

In the old days, PEEKs were not permitted to cause a dataflow to be created, while SELECTs were. Put another way: PEEKs were guaranteed to take the fast path, while SELECTs could optionally take the slow path if they were too complicated. The `materialized: bool` parameter to Plan::Peek captured this distinction. But PEEK was removed in 878b715f5, so we no longer need it; its value is always implicitly true.

view details

Nikhil Benesch

commit sha 559a6398367cec08aa90ccbfe027ff87f9476784

Merge pull request #4606 from benesch/peek-materialize

sql: remove dead materialized parameter for peeks

view details

push time in a day

PR merged MaterializeInc/materialize

sql: remove dead materialized parameter for peeks

In the old days, PEEKs were not permitted to cause a dataflow to be created, while SELECTs were. Put another way: PEEKs were guaranteed to take the fast path, while SELECTs could optionally take the slow path if they were too complicated.

The materialized: bool parameter to Plan::Peek captured this distinction. But PEEK was removed in 878b715f5, so we no longer need it; its value is always implicitly true.
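For illustration, the shape of the change is roughly the following (simplified, hypothetical types, not the actual Materialize definitions):

#[allow(dead_code)]
mod before {
    pub struct RelationExpr;
    pub enum Plan {
        Peek {
            source: RelationExpr,
            // Distinguished fast-path PEEKs from slow-path SELECTs, but is
            // always true now that PEEK is gone.
            materialized: bool,
        },
    }
}

#[allow(dead_code)]
mod after {
    pub struct RelationExpr;
    pub enum Plan {
        // With PEEK removed, the flag carries no information, so drop it.
        Peek { source: RelationExpr },
    }
}

fn main() {}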

+3 -16

0 comment

3 changed files

benesch

pr closed time in a day

push event benesch/materialize

Nikhil Benesch

commit sha 25ae54327e19710891c752d58a1be31b9808964e

sql: remove dead materialized parameter for peeks

In the old days, PEEKs were not permitted to cause a dataflow to be created, while SELECTs were. Put another way: PEEKs were guaranteed to take the fast path, while SELECTs could optionally take the slow path if they were too complicated. The `materialized: bool` parameter to Plan::Peek captured this distinction. But PEEK was removed in 878b715f5, so we no longer need it; its value is always implicitly true.

view details

push time in a day

PR opened MaterializeInc/materialize

sql: remove dead materialized parameter for peeks

In the old days, PEEKs were not permitted to cause a dataflow to be created, while SELECTs were. Put another way: PEEKs were guaranteed to take the fast path, while SELECTs could optionally take the slow path if they were too complicated.

The materialized: bool parameter to Plan::Peek captured this distinction. But PEEK was removed in 878b715f5, so we no longer need it; its value is always implicitly true.

+2 -12

0 comment

3 changed files

pr created time in a day

create branch benesch/materialize

branch : peek-materialize

created branch time in a day

issue comment MaterializeInc/materialize

Docs: Implement copy button for commands to improve docs usability and reduce copy/paste errors

This is a great idea, and something we've been meaning to do for a while.

krishmanoh2

comment created time in a day

delete branch benesch/materialize

delete branch : p-help

delete time in a day

push event MaterializeInc/materialize

Nikhil Benesch

commit sha 817f96d5522e07512c5892d6714f2d45620bbb8f

materialized: sync description of -p/-n opts with docs

The docs are a bit more clear than the help text for the distributed mode options (-p and -n), so make the help text match the docs.

Fix #4600.

view details

Nikhil Benesch

commit sha de86882339e26ff8ec713f2f77116f22b7cfa7d4

Merge pull request #4602 from benesch/p-help

materialized: sync description of -p/-n opts with docs

view details

push time in a day

PR merged MaterializeInc/materialize

materialized: sync description of -p/-n opts with docs

The docs are a bit more clear than the help text for the distributed mode options (-p and -n), so make the help text match the docs.

Fix #4600.

+2 -2

0 comment

1 changed file

benesch

pr closed time in a day

issue closed MaterializeInc/materialize

UX - materialized -h help syntax not in sync with documentation

materialized -h help syntax not in sync with documentation

-p, --process INDEX identity of this process (default 0)
-n, --processes N   total number of processes (default 1)

Docs has it correct -

Screenshot from 2020-10-21 09-44-40

@rjnn

closed time in a day

krishmanoh2

PR opened MaterializeInc/materialize

materialized: sync description of -p/-n opts with docs

The docs are a bit more clear than the help text for the distributed mode options (-p and -n), so make the help text match the docs.

Fix #4600.

+2 -2

0 comment

1 changed file

pr created time in a day

create branch benesch/materialize

branch : p-help

created branch time in a day

issue comment MaterializeInc/materialize

UX - remove the word DANGEROUS and change "enable experimental features" to "enable beta features"

Experimental features are dangerous. They might corrupt your data or brick your cluster. They are not to be used in production. The current wording is correct, in my opinion.

krishmanoh2

comment created time in a day

issue comment MaterializeInc/materialize

UX - materialized -h help syntax not in sync with documentation

What is out of sync here? They both look correct to me.

krishmanoh2

comment created time in a day

issue comment MaterializeInc/materialize

UX - Create and use a config file for materialize

The reason we don't presently support a config file is because deployment strategies that use Docker and Kubernetes make the config file experience subpar. Config files work great when you are installing software yourself, manually, on a Linux box. But when you're using the materialized Docker image, it's kind of a pain to inject a config file into the image. It's much easier to just set command-line flags in your Docker/Kubernetes configuration.

We may need to support a config file down the road, but I don't think the extra complexity we'd need to parse and validate a config file is sufficiently justified yet. There are plenty of ways to ensure consistent startup without a config file, e.g., by adding the desired configuration as command-line options in a startup script.

krishmanoh2

comment created time in a day

issue comment MaterializeInc/materialize

UX: Startup script for materialize

Discussed a bit in https://github.com/MaterializeInc/materialize/issues/4595#issuecomment-715456922. There is already a systemd unit file that ships with the apt package.

krishmanoh2

comment created time in a day

issue comment MaterializeInc/materialize

UX - Basic installer for materialize

This is an intentional design decision. Materialize is easy enough to install and configure that I do not believe an interactive installer is justified. If there are specific deficiencies in the installation experience, let's see if we can address those in the current framework! Specifically there are a few asks here.

  • Describing how to configure Materialize. I think this is already well described in our CLI docs: https://materialize.io/docs/cli/
  • Validating the environment. We can just do this whenever you start up the materialized binary. We already do a bit of this to warn you if your nofiles ulimit is too low (a sketch of such a check follows this list).
  • Capturing anonymized user information. We've started doing some of this with an automatic upgrade check that runs on startup. It would be quite straightforward to also send up information about the user's system as part of that check.
  • Startup scripts. This already exists if you install via the apt package. sudo systemctl start materialized will launch Materialize with a default configuration, for example.
  • Nonprivileged user. Again this is automatically created by the apt package.
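On the environment-validation point above, here is a minimal sketch of the kind of nofiles check described, using the libc crate; the threshold and wording are illustrative assumptions, not Materialize's actual values:

fn check_nofile_limit() {
    // Hypothetical recommended minimum, for illustration only.
    const RECOMMENDED_NOFILE: u64 = 1024;
    let mut rl = libc::rlimit {
        rlim_cur: 0,
        rlim_max: 0,
    };
    // getrlimit returns 0 on success and fills in the soft and hard limits.
    if unsafe { libc::getrlimit(libc::RLIMIT_NOFILE, &mut rl) } == 0
        && (rl.rlim_cur as u64) < RECOMMENDED_NOFILE
    {
        eprintln!(
            "warning: nofiles ulimit ({}) is below the recommended minimum ({})",
            rl.rlim_cur, RECOMMENDED_NOFILE
        );
    }
}

fn main() {
    check_nofile_limit();
}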

If the ask here is to provide an RPM, that's definitely something we intend to do, to complement the apt package. Would that address your concerns?

krishmanoh2

comment created time in a day

push event MaterializeInc/materialize

Nikhil Benesch

commit sha 533735e21210633ff4fe3ff200ae76c89e57a9b6

sql: split plan_homogeneous_exprs

Cleave plan_homogeneous_exprs into two parts: plan_exprs and coerce_homogeneous_exprs. This separation is quite clean. It makes room for the next commit, in which a function wants to call coerce_homogeneous_exprs with pre-planned expressions.

view details

Nikhil Benesch

commit sha 4f8cc13244441228d60bf42cc9e38d7d98b56ff6

sql: eagerly coerce arrays and lists

In PostgreSQL, a bare string literal like '42' will have type "unknown". If that string literal is used in an expression like '42' + 1, PostgreSQL will "coerce" it to an integer, and the expression is therefore well-typed. ("unknown" types are confusing, though, so we use something slightly different in Materialize: we explicitly represent expressions whose type is undetermined via the CoercibleScalarExpression type.)

A similar situation occurs with bare ARRAY expressions, like ARRAY['1', '2']. Unlike string literals, in PostgreSQL, ARRAY expressions are eagerly coerced, and so this expression is immediately determined to have type "text[]". This determination is made by forcibly coercing the elements, so ARRAY[1, '2'] will be assigned type "int[]", since the usual rules for homogeneous coercion apply to the elements.

This poses a bit of a problem for empty arrays like ARRAY[], since there are no elements that can determine the element type. That expression alone yields a "cannot determine type of empty array" error. So PostgreSQL special cases direct casts of ARRAY expressions, as in ARRAY[]::int[], so that you can at least construct empty arrays of a desired type. More complicated cases, like `function_accepting_int_array(ARRAY[])`, are not handled, and result in the same "cannot determine type of empty array" error.

Previously, Materialize was a bit too smart, and typed empty ARRAY expressions as "unknown", precisely so that something like `function_accepting_int_array(ARRAY[])` would be accepted. These extra smarts interact poorly with polymorphic function arguments (#4582), however, as resolving polymorphic function arguments requires knowing the concrete array/list types involved.

So, just do it exactly like PostgreSQL does it. ARRAY and LIST expressions are now eagerly coerced, with one special case for when an ARRAY/LIST expression is directly cast to a different type. While this is a bit less expressive, it results in only minor regressions in our test suite, and support for polymorphic function arguments is well worth it.

view details

Nikhil Benesch

commit sha f14a8c1f507ac096d3a925a1b458595ceeaf8ec3

Merge pull request #4589 from MaterializeInc/array-list-coercion

sql: eagerly coerce arrays and lists

view details

push time in 2 days

delete branch MaterializeInc/materialize

delete branch : array-list-coercion

delete time in 2 days

PR merged MaterializeInc/materialize

sql: eagerly coerce arrays and lists

In PostgreSQL, a bare string literal like

'42'

will have type "unknown". If that string literal is used in an expression like

'42' + 1

PostgreSQL will "coerce" it to an integer, and the expression is therefore well-typed.

("unknown" types are confusing, though, so we use something slightly different in Materialize: we explicitly represent expressions whose type is undetermined via the CoercibleScalarExpression type.)

A similar situation occurs with bare ARRAY expressions, like:

ARRAY['1', '2']

Unlike string literals, in PostgreSQL, ARRAY expressions are eagerly coerced, and so this expression is immediately determined to have type "text[]". This determination is made by forcibly coercing the elements, so

ARRAY[1, '2']

will be assigned type "int[]", since the usual rules for homogeneous coercion apply to the elements. This poses a bit of a problem for empty arrays like

ARRAY[]

since there are no elements that can determine the element type. That expression alone yields a "cannot determine type of empty array" error.

So PostgreSQL special cases direct casts of ARRAY expressions, as in

ARRAY[]::int[]

so that you can at least construct empty arrays of a desired type. More complicated cases, like function_accepting_int_array(ARRAY[]), are not handled, and result in the same "cannot determine type of empty array" error.

Previously, Materialize was a bit too smart, and typed empty ARRAY expressions as "unknown", precisely so that something like function_accepting_int_array(ARRAY[]) would be accepted. These extra smarts interact poorly with polymorphic function arguments (#4582), however, as resolving polymorphic function arguments requires knowing the concrete array/list types involved.

So, just do it exactly like PostgreSQL does it. ARRAY and LIST expressions are now eagerly coerced, with one special case for when an ARRAY/LIST expression is directly cast to a different type. While this is a bit less expressive, it results in only minor regressions in our test suite, and support for polymorphic function arguments is well worth it.
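To make the eager coercion strategy concrete, here is a toy model of it (hypothetical types, not the actual CoercibleScalarExpression machinery):

#[derive(Clone, Debug, PartialEq)]
enum Type {
    Int,
    Array(Box<Type>),
}

// A stand-in for an expression whose type may not be determined yet, like a
// bare string literal.
enum Coercible {
    Known(Type),
    Unknown,
}

// Eager coercion, as PostgreSQL does for ARRAY[...]: the first element with a
// known type fixes the array type, and unknown elements are coerced to it.
// An ARRAY[] with no elements and no direct cast is an error.
fn type_of_array(elems: &[Coercible]) -> Result<Type, String> {
    for e in elems {
        if let Coercible::Known(t) = e {
            return Ok(Type::Array(Box::new(t.clone())));
        }
    }
    // The direct-cast special case (ARRAY[]::int[]) is handled elsewhere.
    Err("cannot determine type of empty array".into())
}

fn main() {
    let arr = [Coercible::Known(Type::Int), Coercible::Unknown];
    assert_eq!(type_of_array(&arr), Ok(Type::Array(Box::new(Type::Int))));
    assert!(type_of_array(&[]).is_err());
}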

+148 -133

0 comment

5 changed files

benesch

pr closed time in 2 days

Pull request review comment MaterializeInc/materialize

sql: eagerly coerce arrays and lists

 pub fn plan_expr<'a>(ecx: &'a ExprContext, e: &Expr) -> Result<CoercibleScalarEx
     })
 }

-/// Plans a list of expressions such that all input expressions will be cast to
-/// the same type. If successful, returns a new list of expressions in the same
-/// order as the input, where each expression has the appropriate casts to make
-/// them all of a uniform type.
+/// Plans a list of expressions.

No, happy to! Thanks for the quick review.

benesch

comment created time in 2 days

PullRequestReviewEvent

push event MaterializeInc/materialize

Nikhil Benesch

commit sha 4f8cc13244441228d60bf42cc9e38d7d98b56ff6

sql: eagerly coerce arrays and lists

In PostgreSQL, a bare string literal like '42' will have type "unknown". If that string literal is used in an expression like '42' + 1, PostgreSQL will "coerce" it to an integer, and the expression is therefore well-typed. ("unknown" types are confusing, though, so we use something slightly different in Materialize: we explicitly represent expressions whose type is undetermined via the CoercibleScalarExpression type.)

A similar situation occurs with bare ARRAY expressions, like ARRAY['1', '2']. Unlike string literals, in PostgreSQL, ARRAY expressions are eagerly coerced, and so this expression is immediately determined to have type "text[]". This determination is made by forcibly coercing the elements, so ARRAY[1, '2'] will be assigned type "int[]", since the usual rules for homogeneous coercion apply to the elements.

This poses a bit of a problem for empty arrays like ARRAY[], since there are no elements that can determine the element type. That expression alone yields a "cannot determine type of empty array" error. So PostgreSQL special cases direct casts of ARRAY expressions, as in ARRAY[]::int[], so that you can at least construct empty arrays of a desired type. More complicated cases, like `function_accepting_int_array(ARRAY[])`, are not handled, and result in the same "cannot determine type of empty array" error.

Previously, Materialize was a bit too smart, and typed empty ARRAY expressions as "unknown", precisely so that something like `function_accepting_int_array(ARRAY[])` would be accepted. These extra smarts interact poorly with polymorphic function arguments (#4582), however, as resolving polymorphic function arguments requires knowing the concrete array/list types involved.

So, just do it exactly like PostgreSQL does it. ARRAY and LIST expressions are now eagerly coerced, with one special case for when an ARRAY/LIST expression is directly cast to a different type. While this is a bit less expressive, it results in only minor regressions in our test suite, and support for polymorphic function arguments is well worth it.

view details

push time in 2 days

issue comment fede1024/rust-rdkafka

Overriding ClientContext::stats()?

You're welcome! It's definitely not the most obvious design, but seems to work well once folks figure it out.

durch

comment created time in 2 days

pull request comment MaterializeInc/materialize

expr: add regexp_match function

Still need to figure out better names for UnaryFunc::MatchRegex (returns bool) and UnaryFunc::RegexpMatch (returns text[]), but this is probably ready for an initial review, @umanwizard.

benesch

comment created time in 2 days

PR opened MaterializeInc/materialize

expr: add regexp_match function

Provides for basic string manipulation capabilities. Matches the PostgreSQL function of the same name, except that the regexps we support are a bit different.

Fix #4545.
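For reference, PostgreSQL's regexp_match returns the capture groups of the first match as a text array (or the whole match, if the pattern has no groups), and NULL if there is no match. A rough analogue of those semantics using Rust's regex crate, whose dialect presumably explains the difference mentioned above (an assumption on my part):

use regex::Regex;

// Rough analogue of regexp_match(haystack, pattern): the captures of the
// first match as an array of optional strings, or None if nothing matches.
fn regexp_match(haystack: &str, pattern: &str) -> Option<Vec<Option<String>>> {
    let re = Regex::new(pattern).ok()?; // a real implementation would error
    let caps = re.captures(haystack)?;
    if caps.len() > 1 {
        // With capture groups, return one entry per group.
        Some(
            caps.iter()
                .skip(1)
                .map(|m| m.map(|m| m.as_str().to_owned()))
                .collect(),
        )
    } else {
        // Without groups, return the whole match as a one-element array.
        Some(vec![Some(caps[0].to_owned())])
    }
}

fn main() {
    assert_eq!(
        regexp_match("foobarbequebaz", "(bar)(beque)"),
        Some(vec![Some("bar".into()), Some("beque".into())])
    );
    assert_eq!(regexp_match("foo", "xyz"), None);
}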

+182 -13

0 comment

7 changed files

pr created time in 2 days

create branch benesch/materialize

branch : regexp-match

created branch time in 2 days

PullRequestReviewEvent

push event MaterializeInc/materialize

Nikhil Benesch

commit sha a423ac943b1918dd26f8cde22208244dadaf96a1

sql: eagerly coerce arrays and lists

In PostgreSQL, a bare string literal like '42' will have type "unknown". If that string literal is used in an expression like '42' + 1, PostgreSQL will "coerce" it to an integer, and the expression is therefore well-typed. ("unknown" types are confusing, though, so we use something slightly different in Materialize: we explicitly represent expressions whose type is undetermined via the CoercibleScalarExpression type.)

A similar situation occurs with bare ARRAY expressions, like ARRAY['1', '2']. Unlike string literals, in PostgreSQL, ARRAY expressions are eagerly coerced, and so this expression is immediately determined to have type "text[]". This determination is made by forcibly coercing the elements, so ARRAY[1, '2'] will be assigned type "int[]", since the usual rules for homogeneous coercion apply to the elements.

This poses a bit of a problem for empty arrays like ARRAY[], since there are no elements that can determine the element type. That expression alone yields a "cannot determine type of empty array" error. So PostgreSQL special cases direct casts of ARRAY expressions, as in ARRAY[]::int[], so that you can at least construct empty arrays of a desired type. More complicated cases, like `function_accepting_int_array(ARRAY[])`, are not handled, and result in the same "cannot determine type of empty array" error.

Previously, Materialize was a bit too smart, and typed empty ARRAY expressions as "unknown", precisely so that something like `function_accepting_int_array(ARRAY[])` would be accepted. These extra smarts interact poorly with polymorphic function arguments (#4582), however, as resolving polymorphic function arguments requires knowing the concrete array/list types involved.

So, just do it exactly like PostgreSQL does it. ARRAY and LIST expressions are now eagerly coerced, with one special case for when an ARRAY/LIST expression is directly cast to a different type. While this is a bit less expressive, it results in only minor regressions in our test suite, and support for polymorphic function arguments is well worth it.

view details

push time in 2 days

PR opened MaterializeInc/materialize

sql: eagerly coerce arrays and lists

In PostgreSQL, a bare string literal like

'42'

will have type "unknown". If that string literal is used in an expression like

'42' + 1

PostgreSQL will "coerce" it to an integer, and the expression is therefore well-typed.

("unknown" types are confusing, though, so we use something slightly different in Materialize: we explicitly represent expressions whose type is undetermined via the CoercibleScalarExpression type.)

A similar situation occurs with bare ARRAY expressions, like:

ARRAY['1', '2']

Unlike string literals, in PostgreSQL, ARRAY expressions are eagerly coerced, and so this expression is immediately determined to have type "text[]". This determination is made by forcibly coercing the elements, so

ARRAY[1, '2']

will be assigned type "int[]", since the usual rules for homogeneous coercion apply to the elements. This poses a bit of a problem for empty arrays like

ARRAY[]

since there are no elements that can determine the element type. That expression alone yields a "cannot determine type of empty array" error.

So PostgreSQL special cases direct casts of ARRAY expressions, as in

ARRAY[]::int[]

so that you can at least construct empty arrays of a desired type. More complicated cases, like function_accepting_int_array(ARRAY[]), are not handled, and result in the same "cannot determine type of empty array" error.

Previously, Materialize was a bit "too" smart, and typed empty ARRAY expressions as "unknown", precisely so that something like function_accepting_int_array(ARRAY[]) would be accepted. These extra smarts interact poorly with polymorphic function arguments (#4582), however, as resolving polymorphic function arguments requires knowing the concrete array/list types involved.

So, just do it exactly like PostgreSQL does it. ARRAY and LIST expressions are now eagerly coerced, with one special case for when an ARRAY/LIST expression is directly cast to a different type. While this is a bit less expressive, it results in only minor regressions in our test suite, and support for polymorphic function arguments is well worth it.

+143 -133

0 comment

5 changed files

pr created time in 2 days

create branch MaterializeInc/materialize

branch : array-list-coercion

created branch time in 2 days

PullRequestReviewEvent
PullRequestReviewEvent

Pull request review comment MaterializeInc/materialize

Add an emacs testdrive mode

+;;; mz-testdrive.el --- Major Mode for testdrive files
+
+;; Copyright Materialize, Inc. All rights reserved.
+;;
+;; Use of this software is governed by the Business Source License
+;; included in the LICENSE file at the root of this repository.
+;;
+;; As of the Change Date specified in that file, in accordance with
+;; the Business Source License, use of this software will be governed
+;; by the Apache License, Version 2.0.
+
+;; Author: Brandon W Maister <bwm@materialize.io

Missing >.

quodlibetor

comment created time in 2 days

PullRequestReviewEvent
PullRequestReviewEvent
PullRequestReviewEvent
PullRequestReviewEvent
PullRequestReviewEvent

issue comment fede1024/rust-rdkafka

Overriding ClientContext::stats()?

Something like this should work:

use rdkafka::client::ClientContext;
use rdkafka::config::ClientConfig;
use rdkafka::consumer::stream_consumer::StreamConsumer;
use rdkafka::consumer::ConsumerContext;
use rdkafka::statistics::Statistics;

struct CustomContext;

impl ClientContext for CustomContext {
    // Called by librdkafka every `statistics.interval.ms` milliseconds.
    fn stats(&self, stats: Statistics) {
        let stats_str = format!("{:?}", stats);
        println!("Stats received: {} bytes", stats_str.len());
    }
}

// StreamConsumer requires a ConsumerContext; the default methods suffice here.
impl ConsumerContext for CustomContext {}

fn main() {
    // Note: stats are only emitted if `statistics.interval.ms` is nonzero.
    let consumer: StreamConsumer<CustomContext> = ClientConfig::new()
        .set("bootstrap.servers", "localhost:9092") // illustrative address
        .set("group.id", "example-group")           // illustrative group
        .set("statistics.interval.ms", "5000")
        .create_with_context(CustomContext)
        .expect("consumer creation failed");
    let _ = consumer;
}

You shouldn't need to interact with StreamConsumerContext at all yourself. It'll be automatically created on your behalf when you call create_with_context.

durch

comment created time in 2 days

issue comment fede1024/rust-rdkafka

Overriding ClientContext::stats()?

Oh, I see. I thought StreamConsumerContext was already a newtype in your project. I see now that it refers to the struct exported by this crate. The way rust-rdkafka is designed, you are intended to create a new type that implements ClientContext. Let me send you an example momentarily.

durch

comment created time in 2 days

issue comment rust-analyzer/rust-analyzer

Latest nightly seems to ignore semantic highlighting settings

@benesch there are several options:

  • you can revert to an older version of the extension
  • you can build the extension from source with grammar removed cargo xtask install --client=code
  • rust-analyzer also explicitly encourages forks, so you are free to publish an alternative extension to crates.io

Thanks for the suggestions. Those are all reasonable, but I'm on the hunt for a solution that allows me to continue to use the latest and greatest rust-analyzer without too much work on my end.

So here's the hack that seems to be working for me:

curl https://github.com/microsoft/vscode/blob/28b6143cb20edb22712acb45eee32c910ea45172/extensions/rust/syntaxes/rust.tmLanguage.json > ~/.vscode/extensions/matklad.rust-analyzer-0.2.352/rust.tmGrammar.json 

If you're using VSCode Remote like I am, you can tweak the command slightly like so and run it on the remote machine:

curl https://github.com/microsoft/vscode/blob/28b6143cb20edb22712acb45eee32c910ea45172/extensions/rust/syntaxes/rust.tmLanguage.json > ~/.vscode-server/extensions/matklad.rust-analyzer-0.2.352/rust.tmGrammar.json 

That just in-place overwrites the grammar that ships with rust-analyzer with the old grammar that ships with VSCode today. I imagine I'll have to run that command whenever rust-analyzer is auto-updated to a new version, but I can live with that.

aloucks

comment created time in 2 days

PullRequestReviewEvent

delete branch benesch/materialize

delete branch : testdrive-grammar

delete time in 2 days

push event MaterializeInc/materialize

Nikhil Benesch

commit sha a9bbe9b90318dbe52e6a4daac9eb482dfb57c4be

editor/vscode: fix bug in testdrive grammar

This fixes a syntax highlighting bug in the testdrive grammar, in which `$` command lines were not properly terminated.

view details

Nikhil Benesch

commit sha cb05bd525d76ddc18deffcbd8f6a16f840a4e913

Merge pull request #4583 from benesch/testdrive-grammar

editor/vscode: fix bug in testdrive grammar

view details

push time in 2 days

PR merged MaterializeInc/materialize

editor/vscode: fix bug in testdrive grammar

This fixes a syntax highlighting bug in the testdrive grammar, in which $ command lines were not properly terminated.

+12 -12

0 comment

1 changed file

benesch

pr closed time in 2 days

issue comment rust-analyzer/rust-analyzer

Latest nightly seems to ignore semantic highlighting settings

if we ship a grammar, a user can't override it with a custom one.

Yeah, unfortunately I just can't use rust-analyzer until this grammar is removed. I know I'm probably weirdly particular, but I can't live with the new grammar, and no amount of tweaks will change that.

Are there any hacks we can use in the meantime to override the grammar? I mean, there must be some sort of extension load order that determines what extension gets to provide the Rust grammar, right?

aloucks

comment created time in 2 days

PR opened MaterializeInc/materialize

editor/vscode: fix bug in testdrive grammar

This fixes a syntax highlighting bug in the testdrive grammar, in which $ command lines were not properly terminated.

+12 -12

0 comment

1 changed file

pr created time in 2 days

create branch benesch/materialize

branch : testdrive-grammar

created branch time in 2 days

push event benesch/this-week-in-rust

Nikhil Benesch

commit sha a0c8f27d72bede893ae057f9ebac58ce400d4ee8

Add blog post about debugging async generator errors

view details

push time in 2 days

fork benesch/this-week-in-rust

Data for this-week-in-rust.org

fork in 2 days

PullRequestReviewEvent
PullRequestReviewEvent

issue comment MaterializeInc/materialize

Tailed file sources occasionally fail to be initialized with cryptic error messages

Is it really so low that you’d run into it with just a dozen files though? That seems unlikely to me. Certainly possible, but just seems unlikely that the default is 10, or that another process used LIMIT - 2.

I wonder if we should just give up on this notify library. It’s been a constant source of bugs. Polling once a second isn’t so bad and it’s a lot more reliable.

On Thu, Oct 22, 2020 at 2:00 AM Brennan Vincent notifications@github.com wrote:

My guess is that sysctl fs.inotify.max_user_watches was too low. If so, we should check for this and report a friendlier error message.

ruchirK

comment created time in 3 days

pull request comment MaterializeInc/materialize

populate a new mz_source_info system table with per partition ingest metrics

Just driving by, but I think you could pull the logger out of https://docs.rs/timely/0.11.1/timely/worker/struct.Worker.html#method.log_register anywhere that you have a scope, if you wanted to avoid the plumbing. Not obviously better, but an option!

elindsey

comment created time in 3 days

create branch benesch/materialize

branch : stream

created branch time in 3 days

PullRequestReviewEvent

push event MaterializeInc/materialize

Nikhil Benesch

commit sha 6cf84cec7ebdd4b4a0c856e4f867d8ff58046bba

coord: rename Catalog::{OpStatus -> Event}

Event is shorter and more evocative.

view details

Nikhil Benesch

commit sha 4d600b05cdfdb26d0fb7b053cdb5c940aa4be006

coord: unify bootstrap/steady-state catalog reporting

The various catalog tables (e.g., mz_databases) need to be updated whenever the catalog state changes. This means once at startup, to make the tables reflect the initial state of the catalog, and again on every catalog operation.

Previously, there were two separate code paths for the startup and steady-state updates. This patch unifies those code paths. The key insight is that on startup, we can synthesize a list of events that describes how to take a catalog from empty to its current on-disk state; events that use exactly the same format as the events that are generated for a steady-state catalog update.

view details

Nikhil Benesch

commit sha 831f2993b9b2167952883fa4a279652e223e8940

Merge pull request #4576 from MaterializeInc/cat-coord-simple

coord: unify bootstrap/steady-state catalog reporting

view details

push time in 3 days

delete branch MaterializeInc/materialize

delete branch : cat-coord-simple

delete time in 3 days

PR merged MaterializeInc/materialize

coord: unify bootstrap/steady-state catalog reporting

The various catalog tables (e.g., mz_databases) need to be updated whenever the catalog state changes. This means once at startup, to make the tables reflect the initial state of the catalog, and again on every catalog operation.

Previously, there were two separate code paths for the startup and steady-state updates. This patch unifies those code paths. The key insight is that on startup, we can synthesize a list of events that describes how to take a catalog from empty to its current on-disk state; events that use exactly the same format as the events that are generated for a steady-state catalog update.
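The pattern is roughly the following sketch (hypothetical types, not the actual catalog API): synthesize bootstrap events from the current state and feed them through the same handler that steady-state updates use.

#[derive(Debug)]
enum Event {
    CreatedDatabase { name: String },
    CreatedSchema { database: String, name: String },
}

struct Catalog {
    // (database name, schema names), standing in for the on-disk state.
    databases: Vec<(String, Vec<String>)>,
}

impl Catalog {
    // Synthesize the events that would take an empty catalog to this state.
    fn bootstrap_events(&self) -> Vec<Event> {
        let mut events = Vec::new();
        for (db, schemas) in &self.databases {
            events.push(Event::CreatedDatabase { name: db.clone() });
            for schema in schemas {
                events.push(Event::CreatedSchema {
                    database: db.clone(),
                    name: schema.clone(),
                });
            }
        }
        events
    }
}

// A single code path updates the system tables, at startup and in steady state.
fn process_catalog_events(events: Vec<Event>) {
    for event in events {
        println!("updating system tables for {:?}", event);
    }
}

fn main() {
    let catalog = Catalog {
        databases: vec![("materialize".into(), vec!["public".into()])],
    };
    process_catalog_events(catalog.bootstrap_events());
}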

+128 -247

0 comment

2 changed files

benesch

pr closed time in 3 days

issue comment rust-analyzer/rust-analyzer

Latest nightly seems to ignore semantic highlighting settings

Whipped that proposal up as a PR here: https://github.com/rust-analyzer/rust-analyzer/pull/6318

aloucks

comment created time in 3 days

PR opened rust-analyzer/rust-analyzer

Don't supply a textmate grammar for Rust

This allows folks to install a different extension to supply the Rust grammar of their choice. Upstream VSCode seems likely to incorporate the grammar that is currently bundled with rust-analyzer as the default grammar anyway.

Fix #6267.

+0 -1040

0 comment

2 changed files

pr created time in 3 days

create branch benesch/rust-analyzer

branch : no-tm-grammar

created branch time in 3 days

fork benesch/rust-analyzer

An experimental Rust compiler front-end for IDEs

https://rust-analyzer.github.io/

fork in 3 days

issue comment rust-analyzer/rust-analyzer

Latest nightly seems to ignore semantic highlighting settings

I tried to write an extension to override this (https://marketplace.visualstudio.com/items?itemName=benesch.legacy-rust-syntax) but unfortunately it seems to be impossible to override the textmate grammar that ships with rust-analyzer. At least, I don't know how to do it.

I'd like to propose that rust-analyzer not ship a Rust grammar at all, besides the semantic one. That would allow folks to choose the grammar they want by either installing or not installing @dustypomerleau's Rust Syntax extension. Upstream VSCode seems likely to adopt that as the standard Rust grammar, too. (And if they do, then the extension I created should work to undo that.)

aloucks

comment created time in 3 days

push event benesch/legacy-rust-syntax

Nikhil Benesch

commit sha f30fe60ad3ec2d2558a125247e4d5979b79c8c5d

Add a separate "Legacy Rust" language

view details

push time in 3 days

push event benesch/legacy-rust-syntax

Nikhil Benesch

commit sha cc903abcd1ad11eb002fdd7a4c805bbc9898ba67

Add a separate "Legacy Rust" language

view details

push time in 3 days

push event benesch/legacy-rust-syntax

Nikhil Benesch

commit sha 8a475f3e1769c5b29f76a8b1035dae3d2153c859

Add a separate "Legacy Rust" language

view details

push time in 3 days

push event benesch/legacy-rust-syntax

Nikhil Benesch

commit sha 2fa91b095855ee81c69d59b6cd8e648b088bc238

Initial commit

view details

push time in 3 days

create branch benesch/legacy-rust-syntax

branch : master

created branch time in 3 days

created repository benesch/legacy-rust-syntax

created time in 3 days

PullRequestReviewEvent

Pull request review comment MaterializeInc/materialize

coord: unify bootstrap/steady-state catalog reporting

 where
             }
         }
-        let mut tables_to_report = HashMap::new();
-        let mut sources_to_report = HashMap::new();
-        let mut views_to_report = HashMap::new();
-        let mut sinks_to_report = HashMap::new();
-        for (id, oid, name, item) in catalog_entries {
-            if let Ok(desc) = item.desc(&name) {
-                self.report_column_updates(desc, id, 1).await?;
-            }
-            match item {
-                CatalogItem::Index(index) => {
-                    self.report_index_update(id, oid, &index, &name.item, 1)
-                        .await
-                }
-                CatalogItem::Table(_) => {
-                    tables_to_report.insert(id, oid);
-                }
-                CatalogItem::Source(_) => {
-                    sources_to_report.insert(id, oid);
-                }
-                CatalogItem::View(_) => {
-                    views_to_report.insert(id, oid);
-                }
-                CatalogItem::Sink(_) => {
-                    sinks_to_report.insert(id, oid);
-                }
-            }
-        }
-
-        // Insert initial named objects into system tables.
-        let dbs: Vec<(
-            String,
-            i64,
-            u32,
-            Vec<(String, i64, u32, Vec<(String, GlobalId)>)>,
-        )> = self
-            .catalog
-            .databases()
-            .map(|(name, database)| {
-                (
-                    name.to_string(),
-                    database.id,
-                    database.oid,
-                    database
-                        .schemas
-                        .iter()
-                        .map(|(schema_name, schema)| {
-                            (
-                                schema_name.to_string(),
-                                schema.id,
-                                schema.oid,
-                                schema
-                                    .items
-                                    .iter()
-                                    .map(|(name, id)| (name.clone(), *id))
-                                    .collect(),
-                            )
-                        })
-                        .collect(),
-                )
-            })
-            .collect();
-        for (database_name, database_id, database_oid, schemas) in dbs {
-            self.report_database_update(database_id, database_oid, &database_name, 1)
-                .await;
-
-            for (schema_name, schema_id, schema_oid, items) in schemas {
-                self.report_schema_update(
-                    schema_id,
-                    schema_oid,
-                    Some(database_id),
-                    &schema_name,
-                    1,
-                )
-                .await;
-
-                for (item_name, item_id) in items {
-                    if let Some(oid) = tables_to_report.remove(&item_id) {
-                        self.report_table_update(item_id, oid, schema_id, &item_name, 1)
-                            .await;
-                    } else if let Some(oid) = sources_to_report.remove(&item_id) {
-                        self.report_source_update(item_id, oid, schema_id, &item_name, 1)
-                            .await;
-                    } else if let Some(oid) = views_to_report.remove(&item_id) {
-                        self.report_view_update(item_id, oid, schema_id, &item_name, 1)
-                            .await;
-                    } else if let Some(oid) = sinks_to_report.remove(&item_id) {
-                        self.report_sink_update(item_id, oid, schema_id, &item_name, 1)
-                            .await;
-                    }
-                }
-            }
-        }
-        let ambient_schemas: Vec<(String, i64, u32, Vec<(String, GlobalId)>)> = self
-            .catalog
-            .ambient_schemas()
-            .map(|(schema_name, schema)| {
-                (
-                    schema_name.to_string(),
-                    schema.id,
-                    schema.oid,
-                    schema
-                        .items
-                        .iter()
-                        .map(|(name, id)| (name.clone(), *id))
-                        .collect(),
-                )
-            })
-            .collect();
-        for (schema_name, schema_id, schema_oid, items) in ambient_schemas {
-            self.report_schema_update(schema_id, schema_oid, None, &schema_name, 1)
-                .await;
-
-            for (item_name, item_id) in items {
-                if let Some(oid) = tables_to_report.remove(&item_id) {
-                    self.report_table_update(item_id, oid, schema_id, &item_name, 1)
-                        .await;
-                } else if let Some(oid) = sources_to_report.remove(&item_id) {
-                    self.report_source_update(item_id, oid, schema_id, &item_name, 1)
-                        .await;
-                } else if let Some(oid) = views_to_report.remove(&item_id) {
-                    self.report_view_update(item_id, oid, schema_id, &item_name, 1)
-                        .await;
-                } else if let Some(oid) = sinks_to_report.remove(&item_id) {
-                    self.report_sink_update(item_id, oid, schema_id, &item_name, 1)
-                        .await;
-                }
-            }
-        }
+        self.process_catalog_events(events).await?;

I've got a follow-up PR (hopefully) that gets rid of this bootstrap method entirely! 🤞

benesch

comment created time in 3 days

push event MaterializeInc/materialize

Nikhil Benesch

commit sha 4d600b05cdfdb26d0fb7b053cdb5c940aa4be006

coord: unify bootstrap/steady-state catalog reporting

The various catalog tables (e.g., mz_databases) need to be updated whenever the catalog state changes. This means once at startup, to make the tables reflect the initial state of the catalog, and again on every catalog operation.

Previously, there were two separate code paths for the startup and steady-state updates. This patch unifies those code paths. The key insight is that on startup, we can synthesize a list of events that describes how to take a catalog from empty to its current on-disk state; events that use exactly the same format as the events that are generated for a steady-state catalog update.

view details

push time in 3 days

PR opened MaterializeInc/materialize

coord: unify bootstrap/steady-state catalog reporting

The various catalog tables (e.g., mz_databases) need to be updated whenever the catalog state changes. This means once at startup, to make the tables reflect the initial state of the catalog, and again on every catalog operation.

Previously, there were two separate code paths for the startup and steady-state updates. This patch unifies those code paths. The key insight is that on startup, we can synthesize a list of events that describes how to take a catalog from empty to its current on-disk state; events that use exactly the same format as the events that are generated for a steady-state catalog update.

+125 -245

0 comment

2 changed files

pr created time in 3 days

create branch MaterializeInc/materialize

branch : cat-coord-simple

created branch time in 3 days

issue opened docker/compose

run command could converge dependencies

Is your feature request related to a problem? Please describe.

Presently docker-compose run SERVICE COMMAND... will start any dependencies of SERVICE if they are not already started. This is great!

Unfortunately, if you change the configuration of one of these dependencies, docker-compose run SERVICE will not recreate this dependency. This is the opposite of what docker-compose up SERVICE will do in this situation, and was very surprising/confusing behavior to me.

Describe the solution you'd like

Teach docker-compose run SERVICE to recreate any dependencies that have changed ("convergence").

Describe alternatives you've considered

Conditionally adding this behavior behind a --recreate-deps-if-changed flag, or somesuch, would also work!

created time in 3 days

PullRequestReviewEvent

delete branch benesch/materialize

delete branch : chaos-pw

delete time in 4 days

push event MaterializeInc/materialize

Nikhil Benesch

commit sha 2efda0ebadc662095b901b5357e8fd4b8e3c3f93

chaos: fix mysql password

The default is correct now!

view details

Nikhil Benesch

commit sha bb5781655df5d5dcd742722516959d79a11254cc

Merge pull request #4561 from benesch/chaos-pw

chaos: fix mysql password

view details

push time in 4 days

PR merged MaterializeInc/materialize

chaos: fix mysql password

The default is correct now!

+0 -1

0 comment

1 changed file

benesch

pr closed time in 4 days

issue comment MaterializeInc/materialize

release: load tests need to be divorced from demos etc.

I think I agree with you in general, but I also feel like the featureset is changing so fast that we should be adding new things to the load tests to exercise new functionality?

I think that's exactly what I'm getting at? Like, it is great that we have an actual test/demo of using LATERAL JOINs! And I agree that adding production-y tests like that is worthwhile. But the way the load tests are presently set up is at odds with this goal. The instructions for the billing load test are: make sure the billing demo doesn't crash, and make sure the graphs of Materialize's CPU usage/memory usage look the same or better than the last run. That means any perturbation of the billing demo itself corrupts the evaluation criteria.

There are several options here off the top of my head:

  • After changing a load test harness, backfill the load test results on a few recent releases so that we have a baseline for the next release.

  • Commit to having just independent load test harnesses and basically never change them.

To be honest, I'm not sure how sensitive the load tests are presently to catching anything besides catastrophic regressions? Like, are any of them sensitive enough to surface your recent improvements to TopK memory usage? My guess is not! I think we're going to need much more targeted and isolated benchmarks to squeeze out the noise if the goal is to catch smaller-scale regressions.

benesch

comment created time in 4 days

push event MaterializeInc/materialize

Nikhil Benesch

commit sha 8a43c3747548673d1e5e65b3006cb14fb0c65d9f

doc/user: polish docs for v0.5.0

view details

Nikhil Benesch

commit sha daf1fb748314ebdfc6b77cf59e7647c1299c2b5b

Merge pull request #4565 from benesch/v050

doc/user: polish docs for v0.5.0

view details

push time in 4 days

push event benesch/materialize

Nikhil Benesch

commit sha 8a43c3747548673d1e5e65b3006cb14fb0c65d9f

doc/user: polish docs for v0.5.0

view details

push time in 4 days

PullRequestReviewEvent

push event benesch/materialize

Nikhil Benesch

commit sha 8536eb27cf3a718ebfc5a72cdf31795a7acea958

doc/user: polish docs for v0.5.0

view details

push time in 4 days
