
microsoft/verona 2881

Research programming language for concurrent ownership

microsoft/snmalloc 425

Message passing based allocator

librespot-org/spotify-connect-resources 131

A repository to hold any data/stuff related to reversing the Spotify Connect protocol. Mostly just data dumps at the moment, but if you have something to add to it, be it an implementation, information or just another data dump, make a PR and I will add it asap.

plietar/dfuse-tool 49

A set of python scripts to create dfu files and upload them to stm32 boards

egnwd/outgain 10

AI based evolution simulation

plietar/dokku-addons 8

Dokku Addons

plietar/dns-parser 1

The parser of DNS protocol packets in rust

plietar/dokku 1

Docker powered mini-Heroku in less than 100 lines of Bash

lanejr/outgain 0

AI based evolution simulation

pull request comment microsoft/verona

Update LLVM and use new ODS Type definitions

I can't seem to get your scripts to work.

The whole http://lab.llvm.org:8011/json API has been giving me 404s for the past few days. Not sure if that's a transient issue on LLVM's side, or if the API got removed.

plietar

comment created time in 2 days

issue comment microsoft/verona

SSA Variables in types and polymorphism

Here's a very crude proof of concept for what this could look like: https://github.com/plietar/verona/commit/4ada0eb57e73223eb28bbcc8c7be4d326edd2fc6.

There's example code in https://github.com/plietar/verona/blob/4ada0eb57e73223eb28bbcc8c7be4d326edd2fc6/poc.mlir:

    verona.method @hello {
      [%X] {
        %Array = verona.class_type @Array[%X]
        %sig = verona.signature (%X) -> %Array
        verona.yield %sig
      }
      [%X](%value) {
        verona.typecheck %value : %X
        %Array = verona.class_type @Array[%X]
        %result = verona.static_call %Array "create" ()
        verona.typecheck %result : %Array
        verona.call %result "push_back" (%value)
        verona.return %result
      }
    }

At the MLIR level, there are only two very simple types, !verona.type (ie. MetaType) and !verona.value (ie. ValueType). Every method has two regions, one that "computes" the signature and one that contains the actual body.

Types can be computed using a couple of special operations, such as class_type, unit_type and signature. Additionally regions can be parametric over types by specifying them in square brackets. [%X](%value) is special syntax for (%X: !verona.type, %value: !verona.value).

The signature region "returns" the computed signature using the verona.yield operation. We should ban any non-type-level computation there (this is not enforced right now).

The body region also makes use of the type bindings:

  • It can call a static method on a !verona.type
  • It can pass a !verona.type as parameters to other methods (not demonstrated yet).
  • It can annotate a !verona.value with a !verona.type, using the typecheck operation. These operations would likely all be inserted by the type inference phase. Later phases use these annotations to look up the type of values.
plietar

comment created time in 3 days

create branch plietar/verona

branch : new-types

created branch time in 3 days

issue comment microsoft/verona

SSA Variables in types and polymorphism

Somewhat relevant discussion, especially when it comes to tying values with types within the body of methods: https://llvm.discourse.group/t/using-a-dialect-to-represent-analysis-information/1545/3

plietar

comment created time in 4 days

issue comment microsoft/verona

SSA Variables in types and polymorphism

@mjp41 Sorry I screwed up the last two lines, should have been

  (%Z: !verona.type) {
    %arg = %Z
    %ret = verona.array<%Z>
    %I = verona.interface @I
    %c = verona.constraint %Z : %I
    verona.signature (%arg) : %ret where %c
  }

This expresses a function foo[Z](a: Z): array[Z] where Z: I. The last line is actually a block terminator, rather than constructing an SSA value and using it in a return terminator.

build a complex form, and destruct it to do the checking at call sites.

Pretty much yeah. I think this will be a lot easier to manipulate than a sequence of subtype assertions. The latter gives me an uncomfortable SMT solver feeling.

plietar

comment created time in 4 days

issue comment microsoft/verona

SSA Variables in types and polymorphism

Then we would need conditions on the type parameters too.

Yeah, we'd also have a region per type-parameter, to describe the constraint.

As a somewhat equivalent (but maybe nicer / more scalable) formulation, we could have a single region that returns the whole signature:

verona.method "foo"
  (%Z: !verona.type) { // Signature
    %arg = %Z
    %ret = verona.array<%Z>
    %I = verona.interface @I
    %c = verona.constraint %ret : %I
    verona.signature (%arg) : %I where %c
  }
  (%Z: !verona.type, %a: !verona.expr) : !verona.expr { ... } // Body
plietar

comment created time in 4 days

issue comment microsoft/verona

SSA Variables in types and polymorphism

MLIR actually allows multiple "regions" (ie. blocks of code, which sadly name-clashes with Verona regions 😞 ) to be attached to a single node.

We could have one region per argument, one region for the return type, and finally one region for the body. The "type expression" regions would (probably) only receive the type parameters.

verona.method "foo"
  (%Z: !verona.type) : !verona.type { return %Z } // First argument type
  (%Z: !verona.type) : !verona.type { // Return type
     %ret = verona.array<%Z>
     return %ret
  }
  (%Z: !verona.type, %a: !verona.expr) : !verona.expr { ... } // Body

This is a bit more "declarative" than @mjp41's requires/ensures block, in that we actually describe how to compute the type, rather than allowing arbitrary assertions.


As a side-note, MLIR recently gained support for regions that don't represent control flow, but instead an arbitrary, possibly cyclic, data-flow graph. This could potentially allow mutually recursive types to be described.
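As a rough illustration of why that matters for recursive types, a graph region would let type-defining operations refer to each other's results cyclically. The verona.types and verona.join_type names below are hypothetical, purely a sketch:

    // Hypothetical sketch: inside a graph region, operations form a
    // data-flow graph rather than a sequence, so %Cons can use %List
    // before %List's defining operation appears.
    verona.types {
      %List = verona.join_type %Nil, %Cons
      %Nil = verona.class_type @Nil
      %Cons = verona.class_type @Cons[%List]  // refers back to %List
    }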

plietar

comment created time in 4 days

Pull request review comment microsoft/verona

Update LLVM and use new ODS Type definitions

 class Verona_Op<string mnemonic, list<OpTrait> traits = []> :
     let verifier = [{ return ::verify(*this); }];
 }
+class Verona_TypeDef<string name> : TypeDef<Verona_Dialect, name> { }

Sorry, I don’t really understand the question. The switch to ODS does not affect the implementation of isaVeronaType, which is roughly how we define what a “Verona Type” is.

To be honest I don’t really know what impact the dialect we pass as the first argument of TypeDef has. I’ll try and dig into that.

plietar

comment created time in 5 days


Pull request review comment microsoft/verona

Update LLVM and use new ODS Type definitions

 #pragma once
-#include "dialect/VeronaTypes.h"
 #include "mlir/IR/Dialect.h"
-namespace mlir::verona
-{
+// This line must come after "normal" includes.

Sorry this is referenced in the commit message but not in the PR description.

Yes, all the namespace stuff is required by changes in MLIR. TableGen now wraps the generated code with namespace blocks, using whatever cppNamespace we define in the .td.

It’s possible we could leave cppNamespace empty, I’m not sure, but that would certainly not be the intended use case.

Cf. https://reviews.llvm.org/D86811

plietar

comment created time in 5 days


pull request comment microsoft/verona

Update LLVM and use new ODS Type definitions

@rengolin This needs an update of the prebuilt LLVM blob. I don't know how to do this (and probably wouldn't have permission to anyway). Would you mind taking care of it please?

I don't know what your process is for picking a commit version. I just used whatever was HEAD when I did the update, ie. https://github.com/llvm/llvm-project/commit/73811d32c72d0760c8c2066e4675dd6f1a7bbef7.

plietar

comment created time in 6 days

pull request comment microsoft/verona

Update LLVM and use new ODS Type definitions

As a side-effect of the LLVM update, I think we could now use a namespace that is not a descendant of mlir, eg. verona::dialect or verona::compiler (though that's what the old compiler uses). This has the benefit (or disadvantage, depending on how you look at it) of requiring uses of mlir types to be qualified everywhere eg. mlir::Type.

I'll leave that to another PR, if this is even something we want to do.

plietar

comment created time in 6 days

PR opened microsoft/verona

Update LLVM and use new ODS Type definitions

This cuts down on a lot of boilerplate around the definition of types, accessors and storage classes.

Because of complications due to its recursive structure, we can't use ODS with ClassType yet. We may be able to use ODS with a custom storage class and extra class declarations, but it leads to a messy Frankenstein definition, so it is easier to keep it all in C++.

Similarly, we can't use the parsing/printing code ODS generates, because our implementations need extra context about recursive classes (ie. the class_stack fields of TypePrinter and TypeParser).

Fixes #331

+195 -340

0 comments

13 changed files

pr created time in 6 days

create branch plietar/verona

branch : llvm-next

created branch time in 6 days

issue comment microsoft/verona

New MLIR ODS type declaration

I think this will help with reducing the boilerplate involved in the definition of types.

I’m not sure we can use the print and parsing “dispatch” functionality, because our print and parse have an extra “stack” argument, used to handle recursive classes.

Like you’ve mentioned, it won’t help us when it comes to defining type rules for operations.

rengolin

comment created time in 10 days

Pull request review comment microsoft/verona

Clangformat use git ls-files.

 macro(clangformat_targets)
     message(WARNING "Not generating clangformat target, no clang-format tool found")
   else ()
     message(STATUS "Generating clangformat target using ${CLANG_FORMAT}")
-    file(GLOB_RECURSE ALL_SOURCE_FILES src/*.cc src/*.h src/*.hh)
+    find_package(Git)
+    execute_process(
+      COMMAND ${GIT_EXECUTABLE} ls-files *.cc *.h *.hh

Last time I tried to run clangformat on the td files it worked very badly.

mjp41

comment created time in 16 days


issue comment microsoft/verona

Implement BasicBlock arguments

FWIW, the old compiler, in which we implemented that paper, already uses block arguments rather than phi nodes.

Each BB has a list of parameters: https://github.com/microsoft/verona/blob/master/src/compiler/ir/ir.h#L304 And the terminators have a list of variables to pass onto the target: https://github.com/microsoft/verona/blob/master/src/compiler/ir/ir.h#L95

See for example: https://github.com/microsoft/verona/blob/master/testsuite/ir/compile-pass/loop/Main.test3.ir.txt BB1 has one argument, and the jump from BB0 and BB2 pass in a value.

The comments and variable names do mention “phi nodes”, because I was originally going to use phi nodes, but later found block arguments to be a nicer representation.
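For comparison, the same join point in the two styles (illustrative pseudo-IR, not the actual compiler syntax):

    ; Phi-node style: the join block selects a value per predecessor.
    bb1:
      %x = phi [%a from bb0], [%b from bb2]

    ; Block-argument style: each predecessor passes the value explicitly,
    ; and bb1 declares a parameter, much like a local function call.
    bb0:
      jump bb1(%a)
    bb2:
      jump bb1(%b)
    bb1(%x):
      ...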

we can't have pending basic blocks (we need to add instructions to them).

I don’t understand why not? A pending basic block is simply one for which more predecessors may be added later in the construction.

rengolin

comment created time in 16 days

delete branch plietar/verona

delete branch : vm-protect

delete time in 16 days

push event microsoft/verona

Paul Liétar

commit sha 2989943fc96bfc2541de0f245409cef1656d2dff

Clarify parameter names of the shadow stack API. Using `root` as the parameter name was confusing, since that was actually referring to the "permanent" root of the region, ie. its entrypoint, rather than the root being added/removed. Renamed to `entry`, and made the semantics of each parameter clearer in the methods' comments.


Paul Liétar

commit sha d0c96af161cd1e94b9384774f1468de180b31fe7

Switch the value stack from a vector to a deque. When a register is cleared, this may cause a re-entrant execution on the VM in order to run the finaliser. This could lead to the stack growing and being relocated in the middle of the Value::clear call, giving very surprising results. By making it a deque, we ensure growing the stack never causes Value objects to move, keeping their address stable even in the event of a re-entrant call.


Paul Liétar

commit sha da1e09a1a912fc522abffe15cda5c5d07d62faa8

Add support for register lists in the bytecode.


Paul Liétar

commit sha b30f0b5e4849cb6d8fa53172f0544936fef2d950

Allow clearing many registers at once. When variables go out of scope, the compiler emits Clear instructions for each of them, which can become very verbose in run logs. By having a separate opcode which accepts an arbitrary number of registers we can reduce the noise (and shrink the bytecode size, although that wasn't the main goal).


Paul Liétar

commit sha 8c821e818a3ee1769b6daecfac9b11ddc9cadbff

Use the new ValueList type in the print opcode. The encoding is actually the same (8 bit length followed by registers), but it hides these details behind the new abstraction.


Paul Liétar

commit sha 039c59555eefeeafad67be493252b8089004e9bb

Add protect and unprotect opcodes to the VM. These prevent objects from being collected, by adding them to their respective regions' shadow stacks.


Paul Liétar

commit sha 174b73a8387beac52237b02d016be92be94403de

Use liveness to insert Protect/Unprotect calls. When a method is called, it is surrounded by a Protect / Unprotect pair in order to prevent any live variables from being garbage collected.


Paul Liétar

commit sha b657cbadb69a697e18e9e268315b23ebcc654e1d

Fix compilation on GCC


Paul Liétar

commit sha 26d6fc1c63766a0cde89edca90f325a632335776

Stop treating the stack as a contiguous array. With the switch to a deque, the stack is not a contiguous array anymore. Avoid direct indexing and pointer arithmetic and instead use the existing accessor functions.


Paul Liétar

commit sha 2b6916d3af6dbff8b73b6acb33b5b6fd8dfffade

Clarify the unowned cown API. Rather than having methods consume_cown and switch_to_cown_body that change the tag of the Value, we now return new values that have the right tag.


push time in 16 days

PR merged microsoft/verona

Add support for the shadow root stack in the VM

The VM gains two new opcodes, PROTECT and UNPROTECT, which respectively add and remove objects from the shadow root stack of the runtime.

The compiler uses liveness information to insert the operations around every method call. Note that this is very conservative. The compiler makes no attempt to guess whether a reference is in fact in the same region as one of the arguments or not. It just assumes it could and protects everything.
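Conceptually, the inserted pairs bracket each call (illustrative bytecode listing; the register names and exact operand syntax here are invented):

    ; r1 and r2 are live across the call, so the compiler conservatively
    ; assumes the callee could trigger collection of the regions they
    ; point into.
    PROTECT r1, r2      ; push onto the regions' shadow root stacks
    r0 = CALL "foo"(r3)
    UNPROTECT r1, r2    ; pop again once the call returns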

Couple of cleanups and fixes:

  • Clarify parameter names of the shadow stack API.
  • Fix a bug caused by finalisers, where the value stack grows at unexpected times.
  • Add proper support for opcodes with a variable number of arguments. The Print opcode already did this, but in a very ad-hoc way.
+559 -129

0 comments

15 changed files

plietar

pr closed time in 16 days

issue comment microsoft/verona

Free variable analysis / BasicBlock arguments

What exactly do you mean by “free variable analysis”? Is it SSA construction, such as in https://pp.info.uni-karlsruhe.de/uploads/publikationen/braun13cc.pdf?

rengolin

comment created time in 16 days

push event plietar/verona

Paul Liétar

commit sha 8399908473f9840dc2090b483dc8de1ebbbf93d0

Clarify the unowned cown API. Rather than having methods consume_cown and switch_to_cown_body that change the tag of the Value, we now return new values that have the right tag.


push time in 16 days

push event plietar/verona

Paul Liétar

commit sha 7f16395cf221b3a68be34d7918ca991bfa3963d0

Stop treating the stack as a contiguous array. With the switch to a deque, the stack is not a contiguous array anymore. Avoid direct indexing and pointer arithmetic and instead use the existing accessor functions.


Paul Liétar

commit sha 9bcd1677790ac589943bbf643889a1ca2b5ef8aa

Clarify the unowned cown API. Rather than having methods consume_cown and switch_to_cown_body that change the tag of the Value, we now return new values that have the right tag.


push time in 16 days

pull request comment microsoft/verona

Changes how classes are represented: MLIR modules

I don't think adding methods to types would make any sense.

Why does it make more sense than adding fields to them? How will you type check method calls? eg.

  verona.call "foo"[%1 : !verona.class<"A">](): !verona.class<"B">

This needs to look at the list of methods that the class A has to find one called "foo", and make sure that method returns a (subtype of) B.

If you can solve the problem of checking method calls, then you can solve the problem of checking fields in the exact same way, by encoding the list of fields in the class operation/module.


I'm happy to go ahead and just use modules, but I'm 90% sure we'll go full circle and add a special class operation in a few weeks.

rengolin

comment created time in 17 days

pull request comment microsoft/verona

Changes how classes are represented: MLIR modules

Modules are barely a "native MLIR" construct. They are just an operation defined by the standard dialect, and are no more special than an operation we define ourselves. To cite the documentation of ModuleOp:

ModuleOp represents a module, or an operation containing one region with a single block containing opaque operations.

There is no reason why we couldn't put functions inside a verona.class operation. It would have been, for the most part, equivalent to a module, but with added flexibility. Of course we can store everything in a module and put a bunch of stuff into attributes and call it a day, but MLIR is intentionally designed to make new operations cheap to define, so why not take advantage of it?

The reason I had duplicated fields in both the operation and the type is that I needed to look up the type of fields when type checking a field read/write, and the semantics of referencing an operation from within a type is not very well defined in MLIR. I do not see how methods will be any simpler, and if we solve the problem for methods then we can apply that solution for fields, and the class type can return to being a flat identifier, with no "keeping in sync" problem.

rengolin

comment created time in 17 days

pull request comment microsoft/verona

Changes how classes are represented: MLIR modules

While I'm not against this change, I have to ask, how is this any different from the old verona.class operation that we removed?

rengolin

comment created time in 17 days

push event plietar/verona

Paul Liétar

commit sha 2767ecea878747e0ee9e007fda34ad48da57ed1f

Fix compilation on GCC


push time in 17 days

PR opened microsoft/verona

Add support for the shadow root stack in the VM

The VM gains two new opcodes, PROTECT and UNPROTECT, which respectively add and remove objects from the shadow root stack of the runtime.

The compiler uses liveness information to insert the operations around every method call. Note that this is very conservative. The compiler makes no attempt to guess whether a reference is in fact in the same region as one of the arguments or not. It just assumes it does.

Couple of cleanups and fixes:

  • Clarify parameter names of the shadow stack API.
  • Fix a bug caused by finalisers, where the value stack grows at unexpected times.
  • Add proper support for opcodes with a variable number of arguments. The Print opcode already did this, but in a very ad-hoc way.
+444 -59

0 comments

14 changed files

pr created time in 17 days

push event plietar/verona

Paul Liétar

commit sha 16e5235caf33e1bd5dc34c60ec2928cf4ead48a9

Add support for register lists in the bytecode.


Paul Liétar

commit sha e00dbd8a95240638783929c5c90cc0011fec2eb9

Allow clearing many registers at once. When variables go out of scope, the compiler emits Clear instructions for each of them, which can become very verbose in run logs. By having a separate opcode which accepts an arbitrary number of registers we can reduce the noise (and shrink the bytecode size, although that wasn't the main goal).


Paul Liétar

commit sha 4b58e022c61dded7e585717213355755c0cae4c7

Use the new ValueList type in the print opcode. The encoding is actually the same (8 bit length followed by registers), but it hides these details behind the new abstraction.


Paul Liétar

commit sha 9c6d7f0c8ad4314f7b24ae936287e2a9f2bca58f

Add protect and unprotect opcodes to the VM. These prevent objects from being collected, by adding them to their respective regions' shadow stacks.


Paul Liétar

commit sha c54658792c9d058f250438dabc2e2be07fd23ab1

Use liveness to insert Protect/Unprotect calls. When a method is called, it is surrounded by a Protect / Unprotect pair in order to prevent any live variables from being garbage collected.


push time in 17 days

create branch plietar/verona

branch : vm-protect

created branch time in 17 days

pull request comment microsoft/verona

WIP: Initial implementation of the region checker.

@plietar, this review is a month old and very outdated.

Sorry, I've gone on a few tangents over the past few weeks. I do intend to make progress at some point; it's just taken some time.

Also, it seems we're moving towards a region check pass to happen at the same time (or even on the same) type checking pass.

I don't think there's any proper consensus on this. I have a pretty clear idea of how I'd do things as separate passes; I just need more work and time to implement it. The best we have for combined type- and region-checking is hand-wavy examples.

plietar

comment created time in 18 days

Pull request review comment microsoft/verona

Refactor class/type lowering, add static calls

 namespace mlir::verona
   ClassType::FieldsRef ClassType::getFields() const
   {
-    assert(getImpl()->isInitialized);
-    return getImpl()->fields;
+    // We may not have a full declaration available

I'm quite unhappy with the notion of "incomplete types". We do not want to follow the C++ model of forward declarations at all. If a type doesn't have a definition it can't be used, end of it.

Having some boilerplate empty definitions class I32 {} at the top of the test files is IMO a better workaround for the lack of modules/includes than introducing broken concepts into the compiler.

rengolin

comment created time in 18 days

Pull request review comment microsoft/verona

Refactor class/type lowering, add static calls

 namespace mlir::verona
     struct JoinTypeStorage;
     struct IntegerTypeStorage;
     struct FloatTypeStorage;
+    struct DescriptorTypeStorage;
     struct CapabilityTypeStorage;
     struct ClassTypeStorage;
     struct ViewpointTypeStorage;
   }

+  /**
+   * Meet types are intersections between types (A & B).

I don't know if David's "new word" was referring to Meet or to Intersection.

I chose meet/join over intersection/union for a few reasons, none of which are particularly strong:

  • Union is a C++ keyword
  • Meet and Join are short, and of the same length
  • Intersection is hard to type
  • (Maybe the most valid reason) there are various interpretations of the words "union" and "intersection", used in different languages. There are the C/C++/Rust unions, there are discriminated unions (ie. Haskell-like data types), there are the "implicitly projected pairs of values" intersections, and there's the Scala-like intersection. None of these have the Pony/Verona semantics. On the other hand, Meet and Join have, I believe, precise and unambiguous semantics. They are respectively the greatest lower bound and least upper bound in the subtyping lattice.
rengolin

comment created time in 18 days

Pull request review comment microsoft/verona

Refactor class/type lowering, add static calls

 namespace mlir::verona
     return getImpl()->elements;
   }

-  IntegerType IntegerType::get(MLIRContext* ctx, size_t width, unsigned sign)
-  {
-    return Base::get(ctx, width, sign);
-  }
-
-  size_t IntegerType::getWidth() const
-  {
-    return getImpl()->width;
-  }
-
-  bool IntegerType::getSign() const
-  {
-    return getImpl()->sign;
-  }
-
-  FloatType FloatType::get(MLIRContext* ctx, size_t width)
-  {
-    return Base::get(ctx, width);
-  }
-
-  size_t FloatType::getWidth() const
+  UnknownType UnknownType::get(MLIRContext* ctx)
   {
-    return getImpl()->width;
+    return ::mlir::detail::TypeUniquer::get<UnknownType>(ctx);
   }

-  BoolType BoolType::get(MLIRContext* ctx)
+  StaticClassType StaticClassType::get(MLIRContext* ctx, Type descriptor)
   {
-    return ::mlir::detail::TypeUniquer::get<BoolType>(ctx);
+    return Base::get(ctx, descriptor);
   }

-  UnknownType UnknownType::get(MLIRContext* ctx)
+  Type StaticClassType::getTypes() const

This shouldn't be in the plural form.

rengolin

comment created time in 18 days


Pull request review comment microsoft/verona

Refactor class/type lowering, add static calls

 namespace mlir::verona::detail
     }
   };

+  struct DescriptorTypeStorage : public TypeStorage
+  {
+    Type descriptor;

Sorry I didn't mean renaming the struct, but the field and getter. I actually liked DescriptorType.

rengolin

comment created time in 18 days


Pull request review comment microsoft/verona

Refactor class/type lowering, add static calls

 namespace mlir::verona
     llvm::ArrayRef<mlir::Type> getElements() const;
   };

-  struct IntegerType
-  : public Type::TypeBase<IntegerType, Type, detail::IntegerTypeStorage>
-  {
-    using Base::Base;
-
-    static IntegerType get(MLIRContext* context, size_t width, unsigned sign);
-
-    size_t getWidth() const;
-    bool getSign() const;
-  };
-
-  struct FloatType
-  : public Type::TypeBase<FloatType, Type, detail::FloatTypeStorage>
-  {
-    using Base::Base;
-
-    static FloatType get(MLIRContext* context, size_t width);
-
-    size_t getWidth() const;
-  };
-
-  struct BoolType : public Type::TypeBase<BoolType, Type, TypeStorage>
+  /**
+   * Unknown types are derived types from operations that cannot define the type
+   * at lowering stage, but will later be replaced by other types during type
+   * inference. Not all types will be concrete by then, but none of them should
+   * be unknown.
+   */
+  struct UnknownType : public Type::TypeBase<UnknownType, Type, TypeStorage>
   {
     using Base::Base;

-    static BoolType get(MLIRContext* context);
+    static UnknownType get(MLIRContext* context);
   };

-  struct UnknownType : public Type::TypeBase<UnknownType, Type, TypeStorage>
+  /**
+   * A class descriptor type, used for access to static members of the class,
+   * including fields and methods. The type pointed to could be unknown before
+   * type inference, but once known, should point to a fully qualified class.

It depends on what you mean by "unknown". I am interpreting it as "an instance of UnknownType"; these are holes that the type inference pass will fill.

Generic type parameters, such as X, are different. These will still be present after type inference, and up until reification, if reification happens.

rengolin

comment created time in 18 days


Pull request review comment microsoft/verona

Refactor class/type lowering, add static calls

 namespace mlir::verona
     llvm::ArrayRef<mlir::Type> getElements() const;
   };

-  struct IntegerType
-  : public Type::TypeBase<IntegerType, Type, detail::IntegerTypeStorage>
-  {
-    using Base::Base;
-
-    static IntegerType get(MLIRContext* context, size_t width, unsigned sign);
-
-    size_t getWidth() const;
-    bool getSign() const;
-  };
-
-  struct FloatType
-  : public Type::TypeBase<FloatType, Type, detail::FloatTypeStorage>
-  {
-    using Base::Base;
-
-    static FloatType get(MLIRContext* context, size_t width);
-
-    size_t getWidth() const;
-  };
-
-  struct BoolType : public Type::TypeBase<BoolType, Type, TypeStorage>
+  /**
+   * Unknown types are derived types from operations that cannot define the type
+   * at lowering stage, but will later be replaced by other types during type
+   * inference. Not all types will be concrete by then, but none of them should
+   * be unknown.
+   */
+  struct UnknownType : public Type::TypeBase<UnknownType, Type, TypeStorage>
   {
     using Base::Base;

-    static BoolType get(MLIRContext* context);
+    static UnknownType get(MLIRContext* context);
   };

-  struct UnknownType : public Type::TypeBase<UnknownType, Type, TypeStorage>
+  /**
+   * A class descriptor type, used for access to static members of the class,
+   * including fields and methods. The type pointed to could be unknown before
+   * type inference, but once known, should point to a fully qualified class.
+   */
+  struct DescriptorType
+  : public Type::TypeBase<DescriptorType, Type, detail::DescriptorTypeStorage>
   {
     using Base::Base;

-    static UnknownType get(MLIRContext* context);
+    static DescriptorType get(MLIRContext* context, Type ref);
+    TypeRange getTypes() const;
   };

+  /**
+   * Capability types represents traits that other types possess.
+   *
+   * Isolated: An entry point to a new region, can only have one reference to
+   * it at any time (ownership model). Can also be created when returning a
+   * sub-region that needs to be isolated.
+   *
+   * Mutable/Immutable: Read-Write/Read-Only variables and viewpoints.
  • Capabilities aren't traits over other types; they are properties of individual references, eg. I can have a String & imm somewhere and a String & mut somewhere else.
  • There can be more than one reference to the entrypoint, but only one of them can be Isolated ie. the others must be Mutable.
  • I don't understand what "Can also be created when returning a sub-region that needs to be isolated." means
  • Immutable is a stronger property than Read-Only. It guarantees that no mutable aliases to that object exist anywhere else. We have been considering adding a separate read-only capability for some time, although it is not clear yet if it is needed (Pony has both imm and readonly).
  • I'm not sure what "and viewpoints" is describing
rengolin

comment created time in 21 days

Pull request review comment microsoft/verona

Refactor class/type lowering, add static calls

 namespace mlir::verona
     llvm::ArrayRef<mlir::Type> getElements() const;
   };

-  struct IntegerType
-  : public Type::TypeBase<IntegerType, Type, detail::IntegerTypeStorage>
-  {
-    using Base::Base;
-
-    static IntegerType get(MLIRContext* context, size_t width, unsigned sign);
-
-    size_t getWidth() const;
-    bool getSign() const;
-  };
-
-  struct FloatType
-  : public Type::TypeBase<FloatType, Type, detail::FloatTypeStorage>
-  {
-    using Base::Base;
-
-    static FloatType get(MLIRContext* context, size_t width);
-
-    size_t getWidth() const;
-  };
-
-  struct BoolType : public Type::TypeBase<BoolType, Type, TypeStorage>
+  /**
+   * Unknown types are derived types from operations that cannot define the type
+   * at lowering stage, but will later be replaced by other types during type
+   * inference. Not all types will be concrete by then, but none of them should

There are many interpretations of the word "concrete". What did you mean here?

rengolin

comment created time in 21 days

Pull request review comment microsoft/verona

Refactor class/type lowering, add static calls

 def Verona_FieldWriteOp : Verona_Op<"field_write", [ Typecheckable ]> {
     }];
 }

+def Verona_CallOp : Verona_Op<"call"> {
+    let summary = "Dynamic call to a class method";
+    let description = [{
+      The `$object` refers to the variable which represents the instance of a
+      class, which has a method called `$name` with arguments (`self`, `$args`).
+
+      Before type inference, the `$object` and most `$args` will probably have
+      Verona.Uknown type, so this can't be a standard MLIR call.
+
+      After type inference and name mangling for classes and methods, this could
+      move to a simple call to the precise mangled name with the correct
+      argument types.
+    }];
+
+    let arguments = (ins
+      Verona_Type : $object,
+      StrAttr : $name,
+      Variadic<Verona_Type> : $args
+    );
+    let results = (outs Verona_Type : $res);
+
+    let assemblyFormat = [{
+        $name `[` $object `:` type($object) `]` `(` $args `,` type($args) `)` attr-dict `:` type($res)
        $name `[` $object `:` type($object) `]` `(` $args `:` type($args) `)` attr-dict `:` type($res)
rengolin

comment created time in 21 days

Pull request review commentmicrosoft/verona

Refactor class/type lowering, add static calls

 namespace mlir::verona     llvm::ArrayRef<mlir::Type> getElements() const;   }; -  struct IntegerType-  : public Type::TypeBase<IntegerType, Type, detail::IntegerTypeStorage>-  {-    using Base::Base;--    static IntegerType get(MLIRContext* context, size_t width, unsigned sign);--    size_t getWidth() const;-    bool getSign() const;-  };--  struct FloatType-  : public Type::TypeBase<FloatType, Type, detail::FloatTypeStorage>-  {-    using Base::Base;--    static FloatType get(MLIRContext* context, size_t width);--    size_t getWidth() const;-  };--  struct BoolType : public Type::TypeBase<BoolType, Type, TypeStorage>+  /**+   * Unknown types are derived types from operations that cannot define the type+   * at lowering stage, but will later be replaced by other types during type+   * inference. Not all types will be concrete by then, but none of them should+   * be unknown.+   */+  struct UnknownType : public Type::TypeBase<UnknownType, Type, TypeStorage>   {     using Base::Base; -    static BoolType get(MLIRContext* context);+    static UnknownType get(MLIRContext* context);   }; -  struct UnknownType : public Type::TypeBase<UnknownType, Type, TypeStorage>+  /**+   * A class descriptor type, used for access to static members of the class,+   * including fields and methods. The type pointed to could be unknown before+   * type inference, but once known, should point to a fully qualified class.

The type pointed to could be unknown

I actually think this is one of the few places in the IR where we never have unknown types. No harm in allowing it though.

should point to a fully qualified class.

Once we have generics (in some form or another), it will be legal to have a descriptor<typaram<X>> imo, since we'll need to allow static methods on type parameters (eg. X.foo() is allowed).

I think there are some restrictions on what can be used here, but it's not really clear to me yet. Until then I'd just leave it out of the comment.

rengolin

comment created time in 21 days

Pull request review commentmicrosoft/verona

Refactor class/type lowering, add static calls

 namespace mlir::verona::detail     }   }; +  struct DescriptorTypeStorage : public TypeStorage+  {+    Type descriptor;

I don't really like the name descriptor. That type isn't the descriptor, it is the type being described. Maybe described_type or underlying_type?

Whatever you pick, could you make the getter's name in DescriptorType match?

rengolin

comment created time in 21 days

Pull request review commentmicrosoft/verona

Refactor class/type lowering, add static calls

 namespace mlir::verona::detail     }   }; +  struct DescriptorTypeStorage : public TypeStorage+  {+    Type descriptor;++    // width

Left over comment?

rengolin

comment created time in 21 days

Pull request review commentmicrosoft/verona

Refactor class/type lowering, add static calls

 namespace mlir::verona    ClassType::FieldsRef ClassType::getFields() const   {-    assert(getImpl()->isInitialized);-    return getImpl()->fields;+    // We may not have a full declaration available

Is there a use for that? In my opinion looking up fields of an incomplete type is a compiler bug and we should crash.

rengolin

comment created time in 21 days

Pull request review commentmicrosoft/verona

Refactor class/type lowering, add static calls

 namespace mlir::verona     return getImpl()->elements;   } -  IntegerType IntegerType::get(MLIRContext* ctx, size_t width, unsigned sign)-  {-    return Base::get(ctx, width, sign);-  }--  size_t IntegerType::getWidth() const-  {-    return getImpl()->width;-  }--  bool IntegerType::getSign() const-  {-    return getImpl()->sign;-  }--  FloatType FloatType::get(MLIRContext* ctx, size_t width)-  {-    return Base::get(ctx, width);-  }--  size_t FloatType::getWidth() const+  UnknownType UnknownType::get(MLIRContext* ctx)   {-    return getImpl()->width;+    return ::mlir::detail::TypeUniquer::get<UnknownType>(ctx);   } -  BoolType BoolType::get(MLIRContext* ctx)+  DescriptorType DescriptorType::get(MLIRContext* ctx, Type descriptor)   {-    return ::mlir::detail::TypeUniquer::get<BoolType>(ctx);+    return Base::get(ctx, descriptor);   } -  UnknownType UnknownType::get(MLIRContext* ctx)+  TypeRange DescriptorType::getTypes() const

Why is this a TypeRange, rather than a single Type?

rengolin

comment created time in 21 days

Pull request review commentmicrosoft/verona

Refactor class/type lowering, add static calls

 namespace mlir::verona     llvm::ArrayRef<mlir::Type> getElements() const;   }; +  /**+   * Meet types are unions between types (A | B).
   * Join types are unions between types (A | B).
rengolin

comment created time in 21 days

Pull request review commentmicrosoft/verona

Refactor class/type lowering, add static calls

 namespace mlir::verona     Type getFieldType(StringRef name) const;   }; +  /**+   * Viewpoint is a view of a type through another type.+   *+   * For examples, storing to a mutable type through an imutable viewpoint+   * is not allowed.

This isn't really the purpose of viewpoint types. It is more about combining the type of a field with the type of the reference it was read from.

For instance, reading a mut field from an imm object gives you a viewpoint<mut, imm> = imm reference.

rengolin

comment created time in 21 days

PullRequestReviewEvent

Pull request review commentmicrosoft/verona

Add additional roots to traced region

 namespace verona::rt       }     } +    /// Add root to the stack.+    /// Preserves for object for a GC.+    static void push_additional_root(Object* root, Object* o, Alloc* alloc)+    {+      RegionTrace* reg = get(root);+      reg->additional_entry_points.push(o, alloc);+    }++    /// Remove root to the stack.+    /// Must be called in reservse order with respect to push_additional_root.

reservse -> reverse

mjp41

comment created time in 22 days

Pull request review commentmicrosoft/verona

Add additional roots to traced region

 namespace memory_gc     alloc_in_region<Cx, Fx, Cx, Fx>(alloc, o); // unreachable      // Run a GC.-    assert(Region::debug_size(o) == 13);+    check(Region::debug_size(o) == 13);     RegionTrace::gc(alloc, o);-    assert(Region::debug_size(o) == 9);+    check(Region::debug_size(o) == 9);      // Swap root, but this creates garbage.     RegionTrace::swap_root(o, nroot);-    assert(!o->debug_is_iso() && nroot->debug_is_iso());+    check(!o->debug_is_iso() && nroot->debug_is_iso());      // Run another GC.     RegionTrace::gc(alloc, nroot);-    assert(Region::debug_size(nroot) == 4);+    check(Region::debug_size(nroot) == 4);      // Create another region.     o = new (alloc) Cx;     o->f1 = new (alloc, o) Fx;     o->f1->c1 = new (alloc, o) Cx;     auto* nnroot = new (alloc, o) Cx;     o->f1->c2 = nnroot;-    assert(Region::debug_size(o) == 4);+    check(Region::debug_size(o) == 4);      // Merge the regions.     RegionTrace::merge(alloc, nroot, o);     nroot->c2 = o;-    assert(Region::debug_size(nroot) == 8);+    check(Region::debug_size(nroot) == 8);      // Swap root again.     RegionTrace::swap_root(nroot, nnroot);-    assert(!nroot->debug_is_iso() && nnroot->debug_is_iso());+    check(!nroot->debug_is_iso() && nnroot->debug_is_iso());      // Run another GC.-    assert(Region::debug_size(nnroot) == 8);+    check(Region::debug_size(nnroot) == 8);     RegionTrace::gc(alloc, nnroot);-    assert(Region::debug_size(nnroot) == 1);+    check(Region::debug_size(nnroot) == 1);      Region::release(alloc, nnroot);     snmalloc::current_alloc_pool()->debug_check_empty();   } +  void test_additional_roots()

Could you extend this test / add another one that checks that objects reachable from an additional root (but which aren't roots themselves) are preserved? It would make sure we not only mark the roots, but also trace them.

mjp41

comment created time in 22 days

Pull request review commentmicrosoft/verona

Add additional roots to traced region

 namespace verona::rt       }     } +    /// Add root to the stack.+    /// Preserves for object for a GC.+    static void push_additional_root(Object* root, Object* o, Alloc* alloc)+    {+      RegionTrace* reg = get(root);+      reg->additional_entry_points.push(o, alloc);+    }++    /// Remove root to the stack.

to -> from

mjp41

comment created time in 22 days

PullRequestReviewEvent

issue commentmicrosoft/verona

Lower class to class types

Yes, that’s pretty much it, except for the fact that the result of the static operation can’t be class<C>, but something slightly different, in order to distinguish between instances of the class and the descriptor of the class.

In terms of descriptor/vtable layout, I do think we want static methods in the vtable anyway, in order to allow static method calls on values, ie. an interface has a static method, but the call is issued on a specific instance of the class that implements that interface.

interface I {
  apply();
}

foo(x: I & mut) {
  x.apply() // Dynamic dispatch to a static method
}

I would make it so that descriptors have a layout that is akin to immutable objects embedded in the .text section (or .rodata maybe), ie. they have their own “descriptor” pointer. That pointer could very well be pointing back to the descriptor itself for example. This way the semantics of call is identical regardless of whether the method is static or not: find the descriptor at a fixed offset from the receiver, load the function pointer from that descriptor, call the function.

There is one potential downside, which is that in order to make static and non-static methods “ABI equivalent”, static methods must take a useless receiver (the pointer to the descriptor), which in a naive implementation (ie. the old compiler) wastes an argument register. There are probably ways we can avoid this.

In practice however, these calls will generally be optimized and dispatched statically anyway, at least until we add the non monomorphized generics of course.

rengolin

comment created time in 22 days

issue commentmicrosoft/verona

Lower class to class types

While I think having call and static_call operations is a good design, and have no particular issue with it, there is another design which is worth considering, and which I had used in the old compiler.

A static call is decomposed into two operations, verona.static and the usual verona.call operation. The first operation takes a type and encodes it into a runtime value. In the bytecode, this creates a pointer to the type’s descriptor. The result of verona.static T has type verona.static<T> (where T is probably restricted to class types).

A verona.static<T> has all the methods of T that are static: https://github.com/microsoft/verona/blob/master/src/compiler/resolution.cc#L645

See for example the calls to Main.use in https://github.com/microsoft/verona/blob/master/testsuite/ir/compile-pass/loop/Main.test1.ir.txt.

One of the benefits of this representation is that it enables static calls on type parameters, eg X.foo(), to be implemented without monomorphization. During lowering, type parameters are transformed into extra runtime parameters carrying pointers to descriptors.

I’m happy to ignore this design completely, or consider it later in the process if/when we add non-monomorphized lowering of generics.

rengolin

comment created time in 23 days

delete branch plietar/verona

delete branch : stack_refactor

delete time in 25 days

fork plietar/qtbase

Qt Base (Core, Gui, Widgets, Network, ...)

fork in 25 days

PullRequestReviewEvent

Pull request review commentmicrosoft/verona

Stack refactor

  namespace verona::rt {+  /**+   * This class contains the core functionality for a stack using aligned blocks+   * of memory. The stack is the size of a single pointer when empty.+   */   template<class T, class Alloc>-  class Stack+  class StackThin   {   private:-    static constexpr size_t STACK_COUNT = 63;+    static constexpr size_t POINTER_COUNT = 64;+    static_assert(+      snmalloc::bits::next_pow2_const(POINTER_COUNT) == POINTER_COUNT,+      "Should be power of 2 for alignment.");++    static constexpr size_t STACK_COUNT = POINTER_COUNT - 1; -    struct Block+    /**+     * The assumes that the allocations are aligned to the same threshold as+     * their size. The blocks contain one previous pointer, and 63 pointers to+     * Ts.  This is a power of two, so we can use the bottom part of the+     * pointer to track the index.+     *+     * As the block contains a previous pointer, there are only 64 possible+     * states for a block, that is 0 - 63 live entries.+     *+     * The stack is represented by a single interior pointer, index, of type+     * T**.+     *+     * Note that `index` can point to a `prev` element of a block,+     * and thus be mistyped. This represents the empty block and is never+     * followed directly.+     */+  public:+    struct alignas(POINTER_COUNT * sizeof(T*)) Block     {-      Block* prev;-      T data[STACK_COUNT];+      T** prev;+      T* data[STACK_COUNT];     }; -    Alloc* alloc;-    Block* block;-    Block* backup;-    size_t index;+  private:+    static_assert(+      sizeof(Block) == alignof(Block), "Size and align must be equal");++    // Dummy block to effectively allow pointer arithmetic on nullptr+    // which is undefined behaviour.  
So we statically allocate a block+    // to represent the end of the stack.+    inline static Block null_block{};++    // Index of the full dummy block+    // Due to pointer arithmetic with nullptr being undefined behaviour+    // we use a statically allocated null block.+    static constexpr T** null_index = &(null_block.data[STACK_COUNT - 1]);++    /// Mask to access the index component of the pointer to a block.+    static constexpr uintptr_t INDEX_MASK = (POINTER_COUNT - 1) * sizeof(T*);++    /// Pointer into a block.  As the blocks are strongly aligned+    /// the bits 9-3 represent the element in the block, with 0 being+    /// a pointer to the `prev` pointer, and implying the empty block.+    T** index;++    /// Takes an index and returns the pointer to the Block+    static Block* get_block(T** ptr)+    {+      return snmalloc::pointer_align_down<sizeof(Block), Block>(ptr);+    }++    /// Checks if an index into a block means the block is empty.+    static bool is_empty(T** ptr)+    {+      return ((uintptr_t)ptr & INDEX_MASK) == 0;+    }++    /// Checks if an index into a block means the block has no space.+    static bool is_full(T** index)+    {+      return ((uintptr_t)index & INDEX_MASK) == INDEX_MASK;+    }    public:-    Stack(Alloc* alloc)-    : alloc(alloc), block(nullptr), backup(nullptr), index(STACK_COUNT)-    {}+    StackThin() : index(null_index)+    {+      static_assert(+        sizeof(*this) == sizeof(void*),+        "Stack should contain only the index pointer");+    } -    ~Stack()+    /// Deallocate the linked blocks for this stack.+    void dealloc(Alloc* alloc)     {-      auto local_block = block;-      while (local_block)+      auto local_block = get_block(index);+      while (local_block != &null_block)       {-        auto prev = local_block->prev;+        auto prev = get_block(local_block->prev);         alloc->template dealloc<sizeof(Block)>(local_block);         local_block = prev;       }-      if (backup != nullptr)-        
alloc->template dealloc<sizeof(Block)>(backup);     } +    /// returns true if this stack is empty     ALWAYSINLINE bool empty()     {-      return block == nullptr;+      return index == null_index;     } -    ALWAYSINLINE T peek()+    /// Return the top element of the stack without removing it.+    ALWAYSINLINE T* peek()     {       assert(!empty());-      return block->data[index - 1];+      return *index;     } -    ALWAYSINLINE T pop()+    /// Call this to pop an element from the stack.+    ALWAYSINLINE T* pop(Alloc* alloc)     {       assert(!empty());+      if (!is_empty(index - 1))+      {+        auto item = peek();+        index--;+        return item;+      } -      index--;-      T item = block->data[index];--      if (index == 0)-        pop_slow_path();--      return item;+      return pop_slow(alloc);     } -    ALWAYSINLINE void push(T item)+    /// Call this to push an element onto the stack.+    ALWAYSINLINE void push(T* item, Alloc* alloc)     {-      if (index < STACK_COUNT)+      if (!is_full(index))       {-        block->data[index] = item;         index++;+        *index = item;+        return;       }-      else-      {-        push_slow_path(item);-      }++      push_slow(item, alloc);     } -    template<void apply(T t)>+    /// For all elements of the stack+    template<void apply(T* t)>     void forall()     {-      Block* curr = block;-      size_t i = index;+      T* curr = index; -      while (curr != nullptr)+      while (curr != null_index)       {         do         {-          i--;-          apply(curr->data[i]);-        } while (i > 0);+          apply(*curr);+          curr--;+        } while (is_empty(curr)); -        curr = curr->prev;-        i = STACK_COUNT;+        curr = get_block(curr)->prev;       }     }    private:-    void pop_slow_path()+    /// Slow path for push, performs a push, when allocation is required.+    void push_slow(T* item, Alloc* alloc)     {-      Block* prev = block->prev;+      
assert(is_full(index)); -      if (backup != nullptr)-        alloc->template dealloc<sizeof(Block)>(backup);+      Block* block = (Block*)alloc->template alloc<sizeof(Block)>();+      T** iter = (T**)block;+      assert(is_empty(iter));+      auto next = get_block(iter); -      backup = block;-      block = prev;-      index = STACK_COUNT;+      assert(index != (T**)&null_block);+      next->prev = index;+      index = &(next->data[0]);+      next->data[0] = item;

I've just realized block and next are the same, so actually this could be simplified quite a bit.

Block* next = (Block*)alloc->template alloc<sizeof(Block)>();
next->prev = index;
next->data[0] = item;
index = &(next->data[0]);
mjp41

comment created time in a month

Pull request review commentmicrosoft/verona

Stack refactor

  namespace verona::rt {+  /**+   * This class contains the core functionality for a stack using aligned blocks+   * of memory. It is not expecting to be used directly, but for one of its+   * wrappers below to be used which correctly handle allocation.+   */   template<class T, class Alloc>-  class Stack+  class StackBase   {   private:     static constexpr size_t STACK_COUNT = 63; -    struct Block+    /**+     * The assumes that the allocations are aligned to the same threshold as+     * their size The blocks contain one previous pointer, and 63 pointers to+     * Ts.  This is a pointer of two, so we can use the bottom part of the+     * pointer to track the index.+     *+     * As the block contains a previous pointer, there are only 64 possible+     * states for a block, that is 0 - 63 live entries.+     *+     * The stack is represented by a single interior pointer, index, of type+     * T**.+     *+     * Note that `index` can point to a `prev` element of a block,+     * and thus be mistyped. This represents the empty block and is never+     * followed directly.+     */+  public:+    struct alignas((STACK_COUNT + 1) * sizeof(T*)) Block     {-      Block* prev;-      T data[STACK_COUNT];+      T** prev;+      T* data[STACK_COUNT];     }; -    Alloc* alloc;-    Block* block;-    Block* backup;-    size_t index;+    inline static Block null_block{};++    // Index of a full block allocated+    // Due to pointer arithmetic with nullptr being undefined behaviour+    // we use a statically allocated null block.+    static constexpr T** null_index = &(null_block.data[STACK_COUNT - 1]);++  private:+    /// Mask to access the index component of the pointer to a block.+    static constexpr uintptr_t INDEX_MASK = STACK_COUNT * sizeof(T*);++    /// Pointer into a block.  
As the blocks are strongly aligned+    /// the bits 9-3 represent the element in the block, with 0 being+    /// a pointer to the `prev` pointer, and implying the empty block.+    T** index;++  private:+    /// Takes an index and returns the pointer to the Block+    static Block* get_block(T** ptr)+    {+      return snmalloc::pointer_align_down<sizeof(Block), Block>(ptr);+    }++    /// Checks if an index into a block means the block is empty.+    static bool is_empty(T** ptr)+    {+      return ((uintptr_t)ptr & INDEX_MASK) == 0;+    }++    /// Checks if an index into a block means the block has space.+    static bool is_not_full(T** index)+    {+      return ((uintptr_t)index & INDEX_MASK) != INDEX_MASK;+    }    public:-    Stack(Alloc* alloc)-    : alloc(alloc), block(nullptr), backup(nullptr), index(STACK_COUNT)-    {}+    StackBase() : index(null_index) {} -    ~Stack()+    /// Deallocate the linked blocks for this stack.+    void dealloc(Alloc* alloc)     {-      auto local_block = block;-      while (local_block)+      auto local_block = get_block(index);+      while (local_block != &null_block)       {-        auto prev = local_block->prev;+        auto prev = get_block(local_block->prev);         alloc->template dealloc<sizeof(Block)>(local_block);         local_block = prev;       }-      if (backup != nullptr)-        alloc->template dealloc<sizeof(Block)>(backup);     } +    /// returns true if this stack is empty     ALWAYSINLINE bool empty()     {-      return block == nullptr;+      return index == null_index;     } -    ALWAYSINLINE T peek()+    /// Return the top element of the stack without removing it.+    ALWAYSINLINE T* peek()     {       assert(!empty());-      return block->data[index - 1];+      return *index;     } -    ALWAYSINLINE T pop()+    /// Call this to determine if pop can proceed without deallocation+    ALWAYSINLINE bool pop_is_fast()     {       assert(!empty());+      return !is_empty(index - 1);+    } +    /// Call this to 
pop an element from the stack.  Only+    /// correct to call this if pop_is_fast just returned+    /// true.+    ALWAYSINLINE T* pop_fast()+    {+      assert(pop_is_fast());+      auto item = peek();       index--;-      T item = block->data[index];+      return item;+    } -      if (index == 0)-        pop_slow_path();+    /// Call this to pop an element from the stack.  Only+    /// correct to call this if pop_is_fast just returned+    /// false.  This returns a pair of the popped element+    /// and the block that the client must dispose of.+    std::pair<T*, Block*> pop_slow()+    {+      auto item = peek();+      T** prev_index = get_block(index)->prev;+      auto dealloc = get_block(index);+      index = prev_index;+      return {item, dealloc};+    } -      return item;+    /// Call this to determine if push can proceed without+    /// a new block.+    ALWAYSINLINE bool push_is_fast()+    {+      return is_not_full(index);     } -    ALWAYSINLINE void push(T item)+    /// Call this to push an element onto the stack.  Only+    /// correct to call this if push_is_fast just returned+    /// true.+    ALWAYSINLINE void push_fast(T* item)     {-      if (index < STACK_COUNT)-      {-        block->data[index] = item;-        index++;-      }-      else-      {-        push_slow_path(item);-      }+      assert(push_is_fast());+      index++;+      *index = item;     } -    template<void apply(T t)>+    /// Call this to push an element onto the stack.  Only+    /// correct to call this if push_is_fast just returned+    /// false.  
It needs to be provided a new block of memory+    /// for the stack to use.+    void push_slow(T* item, Block* block)+    {+      assert(!push_is_fast());++      T** iter = (T**)block;+      assert(is_empty(iter));+      auto next = get_block(iter);++      assert(index != (T**)&null_block);+      next->prev = index;+      index = &(next->data[0]);+      next->data[0] = item;+    }++    /// For all elements of the stack+    template<void apply(T* t)>     void forall()     {-      Block* curr = block;-      size_t i = index;+      T* curr = index; -      while (curr != nullptr)+      while (curr != null_index)       {         do         {-          i--;-          apply(curr->data[i]);-        } while (i > 0);+          apply(*curr);+          curr--;+        } while (is_empty(curr)); -        curr = curr->prev;-        i = STACK_COUNT;+        curr = get_block(curr)->prev;       }     }+  }; -  private:-    void pop_slow_path()+  /**+   * This class uses the block structured stack with extra fields+   * for the allocator, and a backup block, so that the common case+   * of 0-1 elements can be fast, and any other block boundrary case.+   */+  template<class T, class Alloc>+  class Stack

This is what I had in mind: https://github.com/plietar/verona/commit/948431ee917da593582d1a0f43d3448d9f4dd7c3

As far as I can tell it behaves identically, and it keeps the control-flow more centralized.

mjp41

comment created time in a month

PullRequestReviewEvent

create branch plietar/verona

branch : stack_refactor

created branch time in a month

Pull request review commentmicrosoft/verona

Stack refactor

  namespace verona::rt {+  /**+   * This class contains the core functionality for a stack using aligned blocks+   * of memory. It is not expecting to be used directly, but for one of its+   * wrappers below to be used which correctly handle allocation.+   */   template<class T, class Alloc>-  class Stack+  class StackBase   {   private:     static constexpr size_t STACK_COUNT = 63; -    struct Block+    /**+     * The assumes that the allocations are aligned to the same threshold as+     * their size The blocks contain one previous pointer, and 63 pointers to+     * Ts.  This is a pointer of two, so we can use the bottom part of the+     * pointer to track the index.+     *+     * As the block contains a previous pointer, there are only 64 possible+     * states for a block, that is 0 - 63 live entries.+     *+     * The stack is represented by a single interior pointer, index, of type+     * T**.+     *+     * Note that `index` can point to a `prev` element of a block,+     * and thus be mistyped. This represents the empty block and is never+     * followed directly.+     */+  public:+    struct alignas((STACK_COUNT + 1) * sizeof(T*)) Block     {-      Block* prev;-      T data[STACK_COUNT];+      T** prev;+      T* data[STACK_COUNT];     }; -    Alloc* alloc;-    Block* block;-    Block* backup;-    size_t index;+    inline static Block null_block{};++    // Index of a full block allocated+    // Due to pointer arithmetic with nullptr being undefined behaviour+    // we use a statically allocated null block.+    static constexpr T** null_index = &(null_block.data[STACK_COUNT - 1]);++  private:+    /// Mask to access the index component of the pointer to a block.+    static constexpr uintptr_t INDEX_MASK = STACK_COUNT * sizeof(T*);++    /// Pointer into a block.  
As the blocks are strongly aligned+    /// the bits 9-3 represent the element in the block, with 0 being+    /// a pointer to the `prev` pointer, and implying the empty block.+    T** index;++  private:+    /// Takes an index and returns the pointer to the Block+    static Block* get_block(T** ptr)+    {+      return snmalloc::pointer_align_down<sizeof(Block), Block>(ptr);+    }++    /// Checks if an index into a block means the block is empty.+    static bool is_empty(T** ptr)+    {+      return ((uintptr_t)ptr & INDEX_MASK) == 0;+    }++    /// Checks if an index into a block means the block has space.+    static bool is_not_full(T** index)+    {+      return ((uintptr_t)index & INDEX_MASK) != INDEX_MASK;+    }    public:-    Stack(Alloc* alloc)-    : alloc(alloc), block(nullptr), backup(nullptr), index(STACK_COUNT)-    {}+    StackBase() : index(null_index) {} -    ~Stack()+    /// Deallocate the linked blocks for this stack.+    void dealloc(Alloc* alloc)     {-      auto local_block = block;-      while (local_block)+      auto local_block = get_block(index);+      while (local_block != &null_block)       {-        auto prev = local_block->prev;+        auto prev = get_block(local_block->prev);         alloc->template dealloc<sizeof(Block)>(local_block);         local_block = prev;       }-      if (backup != nullptr)-        alloc->template dealloc<sizeof(Block)>(backup);     } +    /// returns true if this stack is empty     ALWAYSINLINE bool empty()     {-      return block == nullptr;+      return index == null_index;     } -    ALWAYSINLINE T peek()+    /// Return the top element of the stack without removing it.+    ALWAYSINLINE T* peek()     {       assert(!empty());-      return block->data[index - 1];+      return *index;     } -    ALWAYSINLINE T pop()+    /// Call this to determine if pop can proceed without deallocation+    ALWAYSINLINE bool pop_is_fast()     {       assert(!empty());+      return !is_empty(index - 1);+    } +    /// Call this to 
pop an element from the stack.  Only+    /// correct to call this if pop_is_fast just returned+    /// true.+    ALWAYSINLINE T* pop_fast()+    {+      assert(pop_is_fast());+      auto item = peek();       index--;-      T item = block->data[index];+      return item;+    } -      if (index == 0)-        pop_slow_path();+    /// Call this to pop an element from the stack.  Only+    /// correct to call this if pop_is_fast just returned+    /// false.  This returns a pair of the popped element+    /// and the block that the client must dispose of.+    std::pair<T*, Block*> pop_slow()+    {+      auto item = peek();+      T** prev_index = get_block(index)->prev;+      auto dealloc = get_block(index);+      index = prev_index;+      return {item, dealloc};+    } -      return item;+    /// Call this to determine if push can proceed without+    /// a new block.+    ALWAYSINLINE bool push_is_fast()+    {+      return is_not_full(index);     } -    ALWAYSINLINE void push(T item)+    /// Call this to push an element onto the stack.  Only+    /// correct to call this if push_is_fast just returned+    /// true.+    ALWAYSINLINE void push_fast(T* item)     {-      if (index < STACK_COUNT)-      {-        block->data[index] = item;-        index++;-      }-      else-      {-        push_slow_path(item);-      }+      assert(push_is_fast());+      index++;+      *index = item;     } -    template<void apply(T t)>+    /// Call this to push an element onto the stack.  Only+    /// correct to call this if push_is_fast just returned+    /// false.  
It needs to be provided a new block of memory+    /// for the stack to use.+    void push_slow(T* item, Block* block)+    {+      assert(!push_is_fast());++      T** iter = (T**)block;+      assert(is_empty(iter));+      auto next = get_block(iter);++      assert(index != (T**)&null_block);+      next->prev = index;+      index = &(next->data[0]);+      next->data[0] = item;+    }++    /// For all elements of the stack+    template<void apply(T* t)>     void forall()     {-      Block* curr = block;-      size_t i = index;+      T* curr = index; -      while (curr != nullptr)+      while (curr != null_index)       {         do         {-          i--;-          apply(curr->data[i]);-        } while (i > 0);+          apply(*curr);+          curr--;+        } while (is_empty(curr)); -        curr = curr->prev;-        i = STACK_COUNT;+        curr = get_block(curr)->prev;       }     }+  }; -  private:-    void pop_slow_path()+  /**+   * This class uses the block structured stack with extra fields+   * for the allocator, and a backup block, so that the common case+   * of 0-1 elements can be fast, and any other block boundrary case.+   */+  template<class T, class Alloc>+  class Stack

With David's BackupAlloc you should be able to get rid of the distinction between StackSmall and StackBase.

Stack<Alloc> would have a BackupAlloc<Alloc> field and a StackSmall<BackupAlloc<Alloc>> field, and its implementation would be limited to just augmenting the method calls with the extra BackupAlloc argument. I think that would also reduce the complexity of the slow/fast path handling.
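To make the suggestion concrete, here is a minimal sketch of the backup-block idea, with BackupAlloc collapsed to a concrete non-template class. The name comes from this thread; the PR itself doesn't show its implementation, so everything below is an assumption:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical BackupAlloc sketch: it caches at most one freed block so
// that the next allocation is a pointer swap instead of a call into the
// underlying allocator.
struct BackupAlloc
{
  void* backup = nullptr;

  void* alloc(std::size_t size)
  {
    if (backup != nullptr)
    {
      // Fast path: hand back the cached block.
      void* b = backup;
      backup = nullptr;
      return b;
    }
    return std::malloc(size);
  }

  void dealloc(void* p)
  {
    // Keep one block around for the next alloc; free the rest.
    if (backup == nullptr)
      backup = p;
    else
      std::free(p);
  }

  ~BackupAlloc()
  {
    if (backup != nullptr)
      std::free(backup);
  }
};
```

With something like this, Stack<Alloc> reduces to StackSmall<BackupAlloc<Alloc>> plus constructor plumbing, which is the reduction in slow/fast path complexity suggested above.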

mjp41

comment created time in a month

PullRequestReviewEvent

Pull request review comment microsoft/verona

Stack refactor

```diff
 namespace verona::rt
 {
+  /**
+   * This class contains the core functionality for a stack using aligned blocks
+   * of memory. It is not expecting to be used directly, but for one of its
+   * wrappers below to be used which correctly handle allocation.
+   */
   template<class T, class Alloc>
-  class Stack
+  class StackBase
   {
   private:
     static constexpr size_t STACK_COUNT = 63;
 
-    struct Block
+    /**
+     * The assumes that the allocations are aligned to the same threshold as
+     * their size The blocks contain one previous pointer, and 63 pointers to
+     * Ts.  This is a pointer of two, so we can use the bottom part of the
+     * pointer to track the index.
+     *
+     * As the block contains a previous pointer, there are only 64 possible
+     * states for a block, that is 0 - 63 live entries.
+     *
+     * The stack is represented by a single interior pointer, index, of type
+     * T**.
+     *
+     * Note that `index` can point to a `prev` element of a block,
+     * and thus be mistyped. This represents the empty block and is never
+     * followed directly.
+     */
+  public:
+    struct alignas((STACK_COUNT + 1) * sizeof(T*)) Block
     {
-      Block* prev;
-      T data[STACK_COUNT];
+      T** prev;
+      T* data[STACK_COUNT];
     };
 
-    Alloc* alloc;
-    Block* block;
-    Block* backup;
-    size_t index;
+    inline static Block null_block{};
+
+    // Index of a full block allocated
+    // Due to pointer arithmetic with nullptr being undefined behaviour
+    // we use a statically allocated null block.
+    static constexpr T** null_index = &(null_block.data[STACK_COUNT - 1]);
+
+  private:
+    /// Mask to access the index component of the pointer to a block.
+    static constexpr uintptr_t INDEX_MASK = STACK_COUNT * sizeof(T*);
+
+    /// Pointer into a block.  As the blocks are strongly aligned
+    /// the bits 9-3 represent the element in the block, with 0 being
+    /// a pointer to the `prev` pointer, and implying the empty block.
+    T** index;
+
+  private:
+    /// Takes an index and returns the pointer to the Block
+    static Block* get_block(T** ptr)
+    {
+      return snmalloc::pointer_align_down<sizeof(Block), Block>(ptr);
+    }
+
+    /// Checks if an index into a block means the block is empty.
+    static bool is_empty(T** ptr)
+    {
+      return ((uintptr_t)ptr & INDEX_MASK) == 0;
+    }
+
+    /// Checks if an index into a block means the block has space.
+    static bool is_not_full(T** index)
+    {
+      return ((uintptr_t)index & INDEX_MASK) != INDEX_MASK;
+    }
 
   public:
-    Stack(Alloc* alloc)
-    : alloc(alloc), block(nullptr), backup(nullptr), index(STACK_COUNT)
-    {}
+    StackBase() : index(null_index) {}
 
-    ~Stack()
+    /// Deallocate the linked blocks for this stack.
+    void dealloc(Alloc* alloc)
     {
-      auto local_block = block;
-      while (local_block)
+      auto local_block = get_block(index);
+      while (local_block != &null_block)
       {
-        auto prev = local_block->prev;
+        auto prev = get_block(local_block->prev);
         alloc->template dealloc<sizeof(Block)>(local_block);
         local_block = prev;
       }
-      if (backup != nullptr)
-        alloc->template dealloc<sizeof(Block)>(backup);
     }
 
+    /// returns true if this stack is empty
    ALWAYSINLINE bool empty()
     {
-      return block == nullptr;
+      return index == null_index;
     }
 
-    ALWAYSINLINE T peek()
+    /// Return the top element of the stack without removing it.
+    ALWAYSINLINE T* peek()
     {
       assert(!empty());
-      return block->data[index - 1];
+      return *index;
     }
 
-    ALWAYSINLINE T pop()
+    /// Call this to determine if pop can proceed without deallocation
+    ALWAYSINLINE bool pop_is_fast()
     {
       assert(!empty());
+      return !is_empty(index - 1);
+    }
 
+    /// Call this to pop an element from the stack.  Only
+    /// correct to call this if pop_is_fast just returned
+    /// true.
+    ALWAYSINLINE T* pop_fast()
+    {
+      assert(pop_is_fast());
+      auto item = peek();
       index--;
-      T item = block->data[index];
+      return item;
+    }
 
-      if (index == 0)
-        pop_slow_path();
+    /// Call this to pop an element from the stack.  Only
+    /// correct to call this if pop_is_fast just returned
+    /// false.  This returns a pair of the popped element
+    /// and the block that the client must dispose of.
+    std::pair<T*, Block*> pop_slow()
+    {
+      auto item = peek();
+      T** prev_index = get_block(index)->prev;
+      auto dealloc = get_block(index);
+      index = prev_index;
+      return {item, dealloc};
+    }
 
-      return item;
+    /// Call this to determine if push can proceed without
+    /// a new block.
+    ALWAYSINLINE bool push_is_fast()
+    {
+      return is_not_full(index);
     }
 
-    ALWAYSINLINE void push(T item)
+    /// Call this to push an element onto the stack.  Only
+    /// correct to call this if push_is_fast just returned
+    /// true.
+    ALWAYSINLINE void push_fast(T* item)
     {
-      if (index < STACK_COUNT)
-      {
-        block->data[index] = item;
-        index++;
-      }
-      else
-      {
-        push_slow_path(item);
-      }
+      assert(push_is_fast());
+      index++;
+      *index = item;
     }
 
-    template<void apply(T t)>
+    /// Call this to push an element onto the stack.  Only
+    /// correct to call this if push_is_fast just returned
+    /// false.  It needs to be provided a new block of memory
+    /// for the stack to use.
+    void push_slow(T* item, Block* block)
+    {
+      assert(!push_is_fast());
+
+      T** iter = (T**)block;
+      assert(is_empty(iter));
+      auto next = get_block(iter);
+
+      assert(index != (T**)&null_block);
+      next->prev = index;
+      index = &(next->data[0]);
+      next->data[0] = item;
+    }
+
+    /// For all elements of the stack
+    template<void apply(T* t)>
     void forall()
     {
-      Block* curr = block;
-      size_t i = index;
+      T* curr = index;
 
-      while (curr != nullptr)
+      while (curr != null_index)
       {
         do
         {
-          i--;
-          apply(curr->data[i]);
-        } while (i > 0);
+          apply(*curr);
+          curr--;
+        } while (is_empty(curr));
 
-        curr = curr->prev;
-        i = STACK_COUNT;
+        curr = get_block(curr)->prev;
       }
     }
+  };
 
-  private:
-    void pop_slow_path()
+  /**
+   * This class uses the block structured stack with extra fields
+   * for the allocator, and a backup block, so that the common case
+   * of 0-1 elements can be fast, and any other block boundrary case.
+   */
+  template<class T, class Alloc>
+  class Stack
+  {
+    using Block = typename StackBase<T, Alloc>::Block;
+    StackBase<T, Alloc> stack;
+    Alloc* alloc;
+    Block* backup;
+
+  public:
+    Stack(Alloc* alloc) : alloc(alloc), backup(nullptr) {}
+
+    ALWAYSINLINE void push(T* item)
     {
-      Block* prev = block->prev;
+      if (stack.push_is_fast())
+      {
+        stack.push_fast(item);
+        return;
+      }
 
-      if (backup != nullptr)
-        alloc->template dealloc<sizeof(Block)>(backup);
+      Block* new_block = backup;
+
+      if (new_block == nullptr)
+        new_block = (Block*)alloc->template alloc<sizeof(Block)>();
+
+      backup = nullptr;
 
-      backup = block;
-      block = prev;
-      index = STACK_COUNT;
+      stack.push_slow(item, new_block);
     }
 
-    void push_slow_path(T item)
+    T* pop()
     {
-      Block* next;
+      if (stack.pop_is_fast())
+      {
+        return stack.pop_fast();
+      }
 
-      if (backup != nullptr)
+      auto res_block = stack.pop_slow();
+      if (backup == nullptr)
       {
-        next = backup;
-        backup = nullptr;
+        backup = res_block.second;
       }
       else
       {
-        next = (Block*)alloc->template alloc<sizeof(Block)>();
+        alloc->template dealloc<sizeof(Block)>(res_block.second);
       }
+      return res_block.first;
+    }
 
-      index = 1;
-      next->data[0] = item;
-      next->prev = block;
-      block = next;
+    ~Stack()
+    {
+      stack.dealloc(alloc);
+      if (backup != nullptr)
+      {
+        alloc->template dealloc<sizeof(Block)>(backup);
+      }
+    }
+
+    bool empty()
+    {
+      return stack.empty();
+    }
+
+    T* peek()
+    {
+      return stack.peek();
+    }
+  };
+
+  /**
+   * Block structured stack, that is a single pointer in size.
+   *
+   * Operations require an explicit Alloc parameter in case more/less is
+   * required.
+   */
+  template<class T, class Alloc>
+  class StackSmall
```

More bikeshedding, I'm not a fan of "small" because it reminds me of the SmallVec/SmallSet found in LLVM and other projects, but small has a completely different meaning there.

How about "ThinStack", from the thin/fat pointer terminology?

mjp41

comment created time in a month

PullRequestReviewEvent

pull request comment microsoft/verona

More class operations lowered from Verona

Thanks for the verona.unknown change.

Thinking more about it, one issue I have with the class type vs the operation is that types are interned, immutable and pointed at throughout the IR.

This means that if a pass needs to modify the structure or details of a class (and I can imagine a number of use cases for this), it would have to walk through the entire IR to update pointers to the new version of the class. Unlike MLIR values, there isn't even an easy way to iterate over all uses of a type to replace them.

To make things even worse, because of the way I implemented recursive classes, they are interned by their name rather than their contents. This means every pass that modifies classes has to come up with a new name for them.
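To illustrate the pain point, here is a toy interner (purely illustrative, not MLIR's actual uniquing API) that, like the recursive-class scheme described above, keys types by name. A later request with different contents silently gets the stale definition back, which is why a pass has to mint a fresh name:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy name-keyed type interner. The first definition wins; later requests
// under the same name return the original entry even if the requested
// contents differ.
struct ClassType
{
  std::string name;
  std::vector<std::string> fields;
};

std::map<std::string, ClassType>& intern_table()
{
  static std::map<std::string, ClassType> table;
  return table;
}

const ClassType* get_class_type(
  const std::string& name, std::vector<std::string> fields)
{
  // try_emplace only inserts if the name is new; an existing entry is
  // returned unchanged, contents ignored.
  auto result =
    intern_table().try_emplace(name, ClassType{name, std::move(fields)});
  return &result.first->second;
}
```

Changing a class's fields under this scheme therefore means interning a new entry under a new name and manually rewriting every pointer that refers to the old one.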

rengolin

comment created time in a month

pull request comment microsoft/verona

Add unknown type to dialect, implement `tidy` and `drop`.

Could we use verona.unknown rather than the abbreviated unk form?

rengolin

comment created time in a month

PR closed microsoft/verona

Add flags to help with writing negative tests

Two flags are added to the verona-mlir binary to help with writing negative tests, where we expect an error to be raised.

--verify-diagnostics allows error messages to be written inline within MLIR files. The compiler succeeds only if all errors listed in the file get emitted.

--split-input-file allows for a single input file to be split into pieces, where each piece is treated independently. This works around the fact that MLIR verifiers abort on the first error they find. By splitting the input file we can include multiple related test cases and expected errors in a single file.


Based on some discussion we had today, this may not be the way we want to proceed. However, I had the PR almost ready, so I figured I'd put it out for others to see.

Alternative options include:

  1. Using FileCheck/OutputCheck to check for diagnostics. This is what the old compiler did.
  2. Configure the test pipeline to split the input before feeding it to the compiler.
  3. Use two separate binaries: a fully configurable verona-opt, which would include these two (and more, such as a configurable pipeline), and an end-user binary.

I've realized since discussing these options that

  1. gives us poorer integration with the compiler, eg. FileCheck will only check that all expected diagnostics are emitted. It won't complain if the compiler emits a diagnostic that was not expected by the test case.
  2. will cause line numbers to be muddled up, which could cause confusion. The --split-input-file implementation tries to handle this, although it could probably do a better job at it (eg. an error at line 50, with a split at line 44, is rendered as `dialect.mlir split at line #44:6:52: error: [...]`).
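For reference, the line-number bookkeeping that --split-input-file has to do can be sketched like this (illustrative only; the real logic lives in mlir::splitAndProcessBuffer, and `// -----` is the split marker MLIR uses):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split a test file on the "// -----" marker while remembering where each
// piece started, so a diagnostic inside a piece can be mapped back to a
// whole-file line number.
struct Piece
{
  std::string text;
  std::size_t start_line; // 1-based line in the original file
};

std::vector<Piece> split_input(const std::string& input)
{
  const std::string marker = "// -----";
  std::vector<Piece> pieces;
  Piece current{"", 1};
  std::size_t line = 1;
  std::size_t pos = 0;

  while (pos < input.size())
  {
    std::size_t eol = input.find('\n', pos);
    std::string l = input.substr(
      pos, eol == std::string::npos ? std::string::npos : eol - pos);

    if (l == marker)
    {
      // Close the current piece; the next one starts on the following line.
      pieces.push_back(current);
      current = Piece{"", line + 1};
    }
    else
    {
      current.text += l;
      current.text += '\n';
    }

    if (eol == std::string::npos)
      break;
    pos = eol + 1;
    line++;
  }

  pieces.push_back(current);
  return pieces;
}
```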
+248 -101

10 comments

6 changed files

plietar

pr closed time in a month

push event plietar/verona

Theo Butler

commit sha 26f908a6099a9efcfda6ebe6e692823183ddada5

Fix log message for rescheduled cown (#294) This PR prevents a log message in the scheduler from attempting to show an invalid epoch mark in the case that the cown popped from the scheduler queue is the token.

view details

Theo Butler

commit sha 3c558af995653aa6671563d58d088e0dab2d3c47

Remove bit fields from cown status (#293) This PR replaces the use of bit fields for tracking cown status. The added separation of the `_load_hist` and `_current_load` fields prevents the clang SLP vectorizer from optimizing the `total_load()` operation into AVX2 instructions. Based on some profiling of the backpressure system, this optimization has a negligible impact on overall runtime/backpressure performance over the non-AVX2 version.

view details

Theo Butler

commit sha f102a809799a3bd517a75d9885849d28457cab3f

Add coin to overload cowns in systematic testing (#292) This PR adds a coin flip to artificially overload a running cown before each behaviour step.

view details

Paul Liétar

commit sha 29bc83d6b5003c1b51c982512c38a977ef80e448

Add flags to help with writing negative tests Two flags are added to the verona-mlir binary to help with writing negative tests, where we expect an error to be raised. --verify-diagnostics allows error messages to be written inline within MLIR files. The compiler succeeds only if all errors listed in the file get emitted. --split-input-file allows for a single input file to be split into pieces, where each piece is treated independently. This works around the fact that MLIR verifiers abort on the first error they find. By splitting the input file we can include multiple related test cases and expected errors in a single file.

view details

Paul Liétar

commit sha 545ea6c5794de7e04e04338de514eb9b5254bee5

Example use of expected errors.

view details

Paul Liétar

commit sha 04a87025762b62579f3195a73fbd924d768c5520

Remove left-over code.

view details

Paul Liétar

commit sha eac71b30a03aac5c6d0d15a12a916892a00b4c0e

Add a proper negative test

view details

Paul Liétar

commit sha 4da295853aca218bcc3851e11dbe42a64966540d

Fix a couple bugs.

view details

Paul Liétar

commit sha e79f3c00ac05b19ed938d2f33436fab64983694f

clang-format

view details

Paul Liétar

commit sha 5290b861783b75b02f0bbfb320227b2b2d28f845

Fix dialect.mlir test case

view details

Paul Liétar

commit sha 97fbf8d42a357e5b13124a7d0ee45a5126cffbcc

Make Driver use just llvm::Error again.

view details

Paul Liétar

commit sha 4e65112cc5b366379b2a5b8465a69d0aba6cc25d

CR

view details

Paul Liétar

commit sha 8050de94935e180a72c5c59a347863c4f5753877

Remove useless `check` variable.

view details

push time in a month

pull request comment microsoft/verona

Add flags to help with writing negative tests

Errors from the generator are not being picked by the diagnostic handler

They never were.

nor being printed by the main file any more.

They are. They get embedded in an llvm::Error and printed by the logAllUnhandledErrors call in processVeronaInput. The readAST flow has barely changed.

LogicalResult is still being used in addition to the two diagnostic handlers and llvm::Error.

Not in the driver. In verona-mlir.cc we can't get around using LogicalResult for processMLIRBuffer if we want to use mlir::splitAndProcessBuffer. I guess openInput and others could use llvm::Error, but I figured it was simpler to stick to LogicalResult for now, given this isn't intended to be reusable.

llvm::Error is being used as a gimmick for LogicalResult.

I don't understand what that means. llvm::Error is used in exactly the same way as before.


Are you planning to add diagnostic handlers to the generator yourself soon?

No. I'm trying to add the minimum set of features I need so I can better test other pieces (eg. typechecking and #278). I'm not going to design and implement a whole diagnostic infrastructure when it is a significant amount of work that is low priority and unrelated to what I'm doing. Until someone else does that, I consider a solution that only covers the MLIR pipeline better than nothing.

I thought this would have been a reasonably straightforward change. Admittedly my first version was very much a mess, but I've done my best to fix it since and touch as little as possible. I don't understand what you want me to do now, apart from designing and implementing the whole diagnostics and error handling. I'll happily close the PR if you believe we can't add this simple feature without first redoing everything.

plietar

comment created time in a month

pull request comment microsoft/verona

Add flags to help with writing negative tests

To be in a strictly better situation, we would need to replace the AST handling to the diagnostic handler before this move.

I'd love to be able to use -verify-diagnostics with Verona source code, but that is a significant change that I'm not willing to do in this PR.

We (probably) can't use the MLIR diagnostic handler to emit errors in the Generator. What we really need is our own version of a diagnostic handler (eg. src/compiler/source_manager.h), and to reimplement the verify-diagnostics mechanism on top of it.

But your PR would make it worse

I don't really see how it makes things worse. We already had an MLIR diagnostic handler that just printed everything to stderr before. The flag just helps use a different one.

plietar

comment created time in a month

push event plietar/verona

Paul Liétar

commit sha c9d8df51867098e2145fd8e91e0e79e51e8ca620

CR

view details

push time in a month

pull request comment microsoft/verona

Add flags to help with writing negative tests

Alright, I've switched the Driver to only use llvm::Error.

verona-mlir still makes heavy use of LogicalResult, because it is required by the API of mlir::splitAndProcessBuffer. Unless we rewrite that function to use something else, there isn't much we can do about it.

In "verify diagnostics" mode, there is a case where we need to ignore the error result, which I do with:

handleAllErrors(std::move(error), [](const llvm::ErrorInfoBase& e) {})

It's not super elegant since it explicitly works around a feature of llvm::Error, but at least it avoids the mess in the driver.
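For context, a toy model of why an explicit handler is needed at all (this is not LLVM's actual implementation, just the shape of the must-check discipline):

```cpp
#include <cassert>
#include <cstdlib>

// Toy model of the llvm::Error discipline: destroying an error that was
// never examined is treated as a programmer bug, so "ignore this error"
// has to be spelled out explicitly.
class Error
{
  bool checked;
  bool failed;

public:
  explicit Error(bool failed_) : checked(!failed_), failed(failed_) {}

  Error(Error&& other) : checked(other.checked), failed(other.failed)
  {
    other.checked = true; // the moved-from value no longer needs checking
  }

  ~Error()
  {
    if (!checked)
      std::abort(); // unchecked error escaping: crash loudly
  }

  explicit operator bool()
  {
    checked = true;
    return failed;
  }
};

// The equivalent of handleAllErrors(std::move(e), [](...) {}): examine the
// error and deliberately do nothing with it.
void consume(Error e)
{
  if (e)
  {
    // intentionally ignored
  }
}
```

LLVM also ships llvm::consumeError for exactly this deliberately-ignore case, which would read slightly better than an empty handleAllErrors handler.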

@rengolin does this look more acceptable to you?

plietar

comment created time in a month

push event plietar/verona

Paul Liétar

commit sha 697343ed614d116bb0215e9e9a501e1c4274cf83

Make Driver use just llvm::Error again.

view details

push time in a month

push event plietar/verona

Theo Butler

commit sha 815f8d1681c16e45e9b469100ac3a5b7292e087c

Show object ID in systematic logs (#289) * Show object ID in systematic logs All systematic logs would previously show the address of objects, instead of the object identifier generated under systematic testing. This PR adds an overload for the stream operator so that `const Object*` is displayed as the ID under systematic testing. * show object id in interpreter

view details

Matthew Parkinson

commit sha 1a69b97f6fe29c32b48dcd6fecc2912a94404df8

Minor refactor Common pattern of declaring and then looking up a function merged into a single operation.

view details

Matthew Parkinson

commit sha 8b524394b4ac144738fcb39269f150aeb9b4f71f

Make `next` call at end of `for` loop

Changes for loop from

```
while ($iter..has_value())
  val = $iter.apply()
  $iter.next()
  [body]
```

to

```
while ($iter..has_value())
  val = $iter.apply()
  [body]
  $iter.next()
```

There is additional complexity to deal with `continue`, which must call `next`.

view details

Paul Liétar

commit sha 9669fd6a566179f3d1f9b54f437e46b1d1569749

Add flags to help with writing negative tests Two flags are added to the verona-mlir binary to help with writing negative tests, where we expect an error to be raised. --verify-diagnostics allows error messages to be written inline within MLIR files. The compiler succeeds only if all errors listed in the file get emitted. --split-input-file allows for a single input file to be split into pieces, where each piece is treated independently. This works around the fact that MLIR verifiers abort on the first error they find. By splitting the input file we can include multiple related test cases and expected errors in a single file.

view details

Paul Liétar

commit sha aa3d0ef40ed8252b83de598accbf2621ddea5013

Example use of expected errors.

view details

Paul Liétar

commit sha f55a13c1a7346293e3a2d2de01b44534882c2d2a

Remove left-over code.

view details

Paul Liétar

commit sha 71b6f8363ec6d49f9076be9dacc710e3ce892b25

Add a proper negative test

view details

Paul Liétar

commit sha 2db160aab2666ab327be8948973d0214c2fd0938

Fix a couple bugs.

view details

Paul Liétar

commit sha d44eadecaea1930c7e74372028cbab6684d43ef5

clang-format

view details

Paul Liétar

commit sha dfecfe4a518b374965bad37610a20d45d257f312

Fix dialect.mlir test case

view details

Paul Liétar

commit sha a0b866044264d8f79f67b70710d6e6e80b268fbc

Make Driver use just llvm::Error again.

view details

push time in a month

push event plietar/verona

Paul Liétar

commit sha 03869f841e86f79daec71798c7a48380056be726

Fix dialect.mlir test case

view details

push time in a month

pull request comment microsoft/verona

Add flags to help with writing negative tests

Is the --split-input-file illustrated in this PR? Shouldn't you have somewhere in the test?

I've added a proper example in testsuite/mlir/mlir-fail/subtyping.mlir

plietar

comment created time in a month

push event plietar/verona

Paul Liétar

commit sha d96a4c274167fee8e9053519d98c190992032d88

Add a proper negative test

view details

Paul Liétar

commit sha ed5eaac81d0c80f97bfdf4ea4d08b82897b5ac84

Fix a couple bugs.

view details

Paul Liétar

commit sha 5e4588228e55b1e4fa1bed8f9e62826f465de52f

clang-format

view details

Paul Liétar

commit sha b28af61cd47238256d1f44442826640311029738

Fix dialect.mlir test case

view details

push time in a month

Pull request review comment microsoft/verona

Add flags to help with writing negative tests

```diff
 int main(int argc, char** argv)
   // Set up pretty-print signal handlers
   llvm::InitLLVM y(argc, argv);
 
+  // Register some generic MLIR command line options
+  // mlir::registerAsmPrinterCLOptions();
```

Nope. Thanks.

plietar

comment created time in a month

PullRequestReviewEvent

push event plietar/verona

Renato Golin

commit sha 9067b6646a9e71ea54fdc356e411148976e351b0

Begining to parse simple classes

view details

Renato Golin

commit sha 459f9a075d2cc44f476cd9307be6f4473982fa4c

Adding class type to symbol table Allowing classes as types for fields, including recursive declaration.

view details

Renato Golin

commit sha 683406f6fd883479ff3263969636370d93b6623b

Add support for class predeclaration Allowing classes that haven't been declared yet to be predeclared and then redefined later. Refactoring `parseClassType` back into `parseClass` to avoid the confusion that we need two different type parsers. In a previous iteration I added that logic straight into `parseType`, but there are too many differences and the code was hard to follow. In theory, that should only be done at the point of the class declaration, which is when `parseClass` is called.

view details

Renato Golin

commit sha 8edf225382ee10bcacd6f29977bb2b07c362adfd

Address review comments * Adding comments to all methods that use <class T> for list push_back * Simplifying getClassTypeElements use in parseClass * Adding a new case for multiple pre-declaration uses + test * Update pre-declaration comment (was forward declaration)

view details

Renato Golin

commit sha 7a866dda1b5ada68416f05cce49d7bf083e7e9b5

Simplify parseClass Gets away with getClassTypeElements altogether, thanks to Matt's idea.

view details

Renato Golin

commit sha 4e5a70924cc0515afd507b62ca7fb30811ac527e

Add copyright check to MLIR files Fixes #287

view details

Paul Liétar

commit sha bb1da7d37e663588c957680be79b1f6308cbb0d1

Add flags to help with writing negative tests Two flags are added to the verona-mlir binary to help with writing negative tests, where we expect an error to be raised. --verify-diagnostics allows error messages to be written inline within MLIR files. The compiler succeeds only if all errors listed in the file get emitted. --split-input-file allows for a single input file to be split into pieces, where each piece is treated independently. This works around the fact that MLIR verifiers abort on the first error they find. By splitting the input file we can include multiple related test cases and expected errors in a single file.

view details

Paul Liétar

commit sha ba0f32cb49b5e33e0e58dc4cb8944e96e861f841

Example use of expected errors.

view details

Paul Liétar

commit sha 4eccd0165cf9d954c8a9e1217b7ff099335209c8

Remove left-over code.

view details

push time in a month

pull request comment microsoft/verona

Add flags to help with writing negative tests

This is a feature, not a bug, and we shouldn't discard it.

I know it's a feature of llvm::Error, but it's not a feature I want in this case. Essentially, with --verify-diagnostics, I don't want to fail if the pass manager fails (it almost certainly will). I could have refactored the driver to not create an llvm::Error in this situation, but to do so in the normal case.

I'm happy to rework this PR and figure out something cleaner for error handling, assuming we want this PR (or part of it) at all.

plietar

comment created time in a month

pull request comment microsoft/verona

Lowers classes as ops/types

How do I know it's a class and not some type I don't know about?

There is no such thing as a type you don't know about. The name resolution pass should store, on the type reference, a pointer to the AST node that defined it, which tells you what kind of type it is (class, type parameter, ...). Sadly it doesn't yet (as I mentioned in my other comment), but you can probably work around it. See the def.cc file: if `ast` is the type reference, `ast::get_def(ast, ast->token)` will give you its definition.

We're not doing the C/C++ thing where definition order matters.

Should I assume that any type I don't know about is a class? If so, then this becomes trivial.

Some types will be generic type parameters, but we don't have a way to represent those in MLIR so it can just abort for now.

We'll need mangling to other MLIR dialects, too

Okay sure, depending on the structure of the MLIR pipeline we might need to do some mangling at some point. But the AST should definitely not be using mangled names, is what I was getting at.

I'm still not yet convinced having the class operator

Me neither. I still think we need something, but I don't know what yet.

rengolin

comment created time in a month

pull request comment microsoft/verona

Lowers classes as ops/types

Okay so looking into the ref.cc file, it seems like we perform name resolution only for the sake of changing the kind of ast node, from ref to classdef/local/..., but we don't actually save a pointer to the referenced node anywhere.

@sylvanc what's the status of this? Is there a reason why the pointer isn't persisted into the AST, or is it just "it wasn't needed at the time"?

rengolin

comment created time in a month

pull request comment microsoft/verona

Lowers classes as ops/types

The AST uses tokens (strings) to refer to variables, functions, types.

Do we not have a name resolution pass that converts from string identifiers to pointers to AST nodes? I would have thought src/ast/sym.h would take care of it.

It does not support any of those cases yet. Nor can it with the current infrastructure. We'd need to leave type holders and treat them as valid Verona types, or do type inference right after declarations, none of which exists yet.

You can decouple the type and class definition generation. As soon as you see a reference to a class name you generate the corresponding ClassType, even if you haven't seen the definition yet:

  • Look up the corresponding AST node for the class definition.
  • Create a ClassType and add it to the cache
  • Generate the types of each field. If these types mention a class, try to use the cache; if it's not there yet, repeat this process for that class.
  • Finish the ClassType definition.

Obviously the first step relies on a preliminary name resolution pass to make it possible to find the definition from the reference.
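The steps above can be sketched as follows (illustrative names; generate and ClassDefs are made up here). Caching the ClassType before visiting its fields is what makes reverse-order and mutually recursive classes work:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// The ClassType is created and cached *before* its field types are
// generated, so a field that mentions a class still under construction,
// including two mutually recursive classes, finds the cache entry instead
// of recursing forever.
struct ClassType
{
  std::string name;
  std::vector<ClassType*> fields;
  bool complete = false;
};

// Stand-in for "look up the AST node": class name -> field class names.
using ClassDefs = std::map<std::string, std::vector<std::string>>;

ClassType* generate(
  const std::string& name,
  const ClassDefs& defs,
  std::map<std::string, ClassType*>& cache)
{
  auto it = cache.find(name);
  if (it != cache.end())
    return it->second; // may still be under construction, which is fine

  // Create the ClassType and add it to the cache immediately.
  ClassType* ty = new ClassType{name, {}, false};
  cache[name] = ty;

  // Only now generate the field types, going through the cache.
  for (const std::string& field : defs.at(name))
    ty->fields.push_back(generate(field, defs, cache));

  // Finish the ClassType definition.
  ty->complete = true;
  return ty;
}
```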

Multiple modules and classes will use mangling and no two sub-classes can have the same name in the same module/class.

I don't see the point of mangling in early stages of the compiler when we can use structured data and pointer identity instead. In my opinion we should rely on mangling only when it comes to emitting symbols for LLVM, purely because LLVM / ELF / other binary formats don't support namespaces.

rengolin

comment created time in a month

pull request comment microsoft/verona

Lowers classes as ops/types

I'm somewhat confused by the way parseClassType is used.

Does the implementation support classes defined in reverse order, where A is defined before B, but A has a field of type B? Seems like the implementation relies on parseClassType having a side-effect of adding to the cache, which would break in this case. As a more complex example, does it support two mutually recursive classes?

I'm somewhat uncomfortable about the String -> Type map to lookup types. How well is that going to work with multiple modules? A mapping from AST node (the class definition one) to Type would be more reliable in my opinion. I don't think there's any need to cache non-class types. That being said, I'm not familiar with the frontend nor with the way name resolution is done, so I could be missing something.

rengolin

comment created time in a month

issue opened microsoft/verona

Test infrastructure wishlist

The language test-suite is currently based on a bunch of cmake scripts that configure ctest. It is unsatisfying because it is inflexible and no one wants to write cmake code to extend it. As alternatives we've been considering switching to lit, or rewriting the cmake scripts into a more flexible (set of) python script. Please feel free to edit this post to add new items, or post them as a comment.

Wishlist

  • Configurable compiler invocation: different tests need to run the compiler in different modes.
  • Configurable expectations: some tests expect the compiler to return with a zero exit status, others with a non-zero one.
  • Running multiple tools in sequence: For example we may want to compile a source file to bytecode, then execute this bytecode in the interpreter
  • Pipe the result of one tool into the next: For example, feed the output of verona-mlir back into itself, to make sure the IR is round-trippable.
  • Compare the output to a golden file. "Output" may be many things: compiler stderr, IR output, internal data-structure, executed program's stdout/stderr.
  • Compare the output with a set of inline FileCheck directives.

Open questions

  • Should tests be configured per-test or per-directory? The existing testsuite uses the following model: $SUITE/$MODE/$TEST.verona, where $SUITE allows tests to be grouped by feature (eg. name resolution), $MODE is how compiler flags / tool pipeline is configured (eg. compile-pass vs compile-fail).

  • The test infrastructure is composed of test enumeration and test execution. Are we happy to keep the former entirely in CMake, while implementing the latter in Python? The cmake enumeration logic is essentially "for $f in testsuite/*.verona; add_test(run_test.py $f); done".

created time in 2 months

pull request comment microsoft/verona

Add flags to help with writing negative tests

I did a bit of refactoring, replacing llvm::Error by LogicalResult in the driver, because 1) mlir::splitAndProcessBuffer requires it and 2) I got annoyed that you can't discard llvm::Errors.

I'm still very unhappy about it (eg. you now can't control how the error message is printed by the driver), so I'm happy to hear alternative opinions.

cc #275

plietar

comment created time in 2 months

PR opened microsoft/verona

Add flags to help with writing negative tests

Two flags are added to the verona-mlir binary to help with writing negative tests, where we expect an error to be raised.

--verify-diagnostics allows error messages to be written inline within MLIR files. The compiler succeeds only if all errors listed in the file get emitted.

--split-input-file allows for a single input file to be split into pieces, where each piece is treated independently. This works around the fact that MLIR verifiers abort on the first error they find. By splitting the input file we can include multiple related test cases and expected errors in a single file.


Based on some discussion we had today, this may not be the way we want to proceed. However, I had the PR almost ready, so I figured I'd put it out for others to see.

Alternative options include:

  1. Using FileCheck/OutputCheck to check for diagnostics. This is what the old compiler did.
  2. Configure the test pipeline to split the input before feeding it to the compiler.
  3. Use two separate binaries: a fully configurable verona-opt, which would include these two (and more, such as a configurable pipeline), and an end-user binary.

I've realized since discussing these options that

  1. gives us poorer integration with the compiler, eg. FileCheck will only check that all expected diagnostics are emitted. It won't complain if the compiler emits a diagnostic that was not expected by the test case
  2. will cause line numbers to be muddled up, which could cause confusion. The --split-input-file implementation tries to handle this, although it could probably do a better job at it (eg. an error at line 50, with a split at line 44, is rendered as `dialect.mlir split at line #44:6:52: error: [...]`).
+212 -85

0 comment

4 changed files

pr created time in 2 months

create branch plietar/verona

branch : mlir-negative-test

created branch time in 2 months

PullRequestReviewEvent