Andreas Madsen (AndreasMadsen)
Computationally Demanding
Copenhagen, Denmark
https://andreasmadsen.github.io/

AndreasMadsen/clarify 137

Remove nodecore related stack trace noise

alexgorbatchev/node-browser-builtins 60

Browser alternatives to built-in node.js modules

AndreasMadsen/async-hook 33

Inspect the life of handle objects in node

AndreasMadsen/distributions 32

A collection of probability distribution functions

AndreasMadsen/article 28

Analyzes a stream of HTML and outputs the article title, text, and image

AndreasMadsen/course-02456-sparsemax 14

TensorFlow and Numpy implementation of sparsemax

AndreasMadsen/dgram-emitter 12

Very simple EventEmitter on top of a UDP server/client

AndreasMadsen/configme 4

Simplest possible configuration tool: without conflict, with defaulting!

AndreasMadsen/cahier 3

static file conversion, writing and reading

AndreasMadsen/course-42137 2

Optimization using metaheuristics - University Timetabling

issue comment nodejs/node

Proposal: mark AsyncResource and AsyncLocalStorage as stable

+1 on marking AsyncResource as stable. It being experimental is the primary reason module authors don't want to use it. Other parts of async_hooks still need work, but I don't see them interacting with the AsyncResource API.

I would suggest adding a documentation section describing what node.js would consider a major/minor/patch change to the async context graph in the future, once the other parts are stable. This would serve as a default guideline for module authors when they change the async context graph.

PS: Be sure to mark the C++ AsyncResource API as stable as well.

Qard

comment created time in 5 days

started mjpost/sacrebleu

started time in 6 days

issue comment AndreasMadsen/python-lrcurve

Callback for xgboost

You could also write your own module that depends on lrcurve, and just treat KerasLearningCurve as an example of how to implement it.

I'm definitely not going to implement it myself, but I may consider it if a PR is made.
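
For reference, a minimal sketch of what such a module could look like. It assumes lrcurve's generic PlotLearningCurve with append/draw methods, and the pre-1.3 xgboost callback interface where a callback is a plain callable receiving an env namedtuple; both the method and field names here are assumptions, not a tested implementation:

from lrcurve import PlotLearningCurve

def lrcurve_callback():
    # One plot per training run; line/facet configuration is omitted
    # and would need to match the metric names xgboost reports.
    plot = PlotLearningCurve()

    def callback(env):
        # env.evaluation_result_list is a list of (name, value) pairs,
        # e.g. [('train-rmse', 0.5), ('eval-rmse', 0.6)]
        metrics = dict(env.evaluation_result_list)
        plot.append(env.iteration, metrics)
        plot.draw()

    return callback

# usage sketch:
# booster = xgboost.train(params, dtrain, num_boost_round=100,
#                         evals=[(dtrain, 'train'), (dval, 'eval')],
#                         callbacks=[lrcurve_callback()])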

bhishanpdl

comment created time in 8 days

push event AndreasMadsen/andreasmadsen.github.io

Andreas Madsen

commit sha 1ce054dfd848d2c7224e08078feb3eed6dbbfcb4

fix some lighthouse issues

push time in 9 days

push event AndreasMadsen/andreasmadsen.github.io

Andreas Madsen

commit sha d49c8ab677a2abafef2480405bbbf6af461fff38

add 'talk' to NMU paper

push time in 17 days

push event AndreasMadsen/andreasmadsen.github.io

Andreas Madsen

commit sha f79c8b08a897663a3a2203e489cbd9d3c10eea91

update titles

push time in 18 days

push event AndreasMadsen/andreasmadsen.github.io

Andreas Madsen

commit sha 4cd0a3363f91c92656dd0bb17647c41e0ab8379b

call it Distill Journal

push time in 18 days

push event AndreasMadsen/andreasmadsen.github.io

Andreas Madsen

commit sha a41b896e6055f5ae964f5700b55edf428b4c0ae2

mention nearform research

push time in 19 days

push event AndreasMadsen/andreasmadsen.github.io

Andreas Madsen

commit sha ff755c46b9d42d079687f0ff202339b5d6494858

preconnect to google fonts for speed

push time in 19 days

push event AndreasMadsen/andreasmadsen.github.io

Andreas Madsen

commit sha e0640389f0c0b3377e472639b2d6e79a771b51a1

auto load me-picture

push time in 19 days

push event AndreasMadsen/andreasmadsen.github.io

Andreas Madsen

commit sha 7eea98b9f39902a5f9051f1798da0872709fc36c

optimize svg

Andreas Madsen

commit sha c255bf479a9bcd2fee4b699a201cf9a4079cf3ad

embed image size in html

push time in 19 days

push event AndreasMadsen/andreasmadsen.github.io

Andreas Madsen

commit sha 7fd6465b130602b6c9bdae6e3e321dfc97fdf733

lazy loading of images

push time in 19 days

push event AndreasMadsen/andreasmadsen.github.io

Andreas Madsen

commit sha 5cbe71c67f33a531003aa428f9e4df722f8f0285

add new entries

push time in 20 days

push event AndreasMadsen/summary

Andreas Madsen

commit sha e32e23e3625f12a07d9c92a635ec9cc3f1be4db9

numerically stable sum

Andreas Madsen

commit sha 84f4956c14166c50989b43b607586b4cb1d32767

version 2.1.0

push time in 23 days

push event AndreasMadsen/andreasmadsen.github.io

Andreas Madsen

commit sha cb29a422ae1db350b09dc6b6e7f1c7a1483751c9

fix wrong double negative

push time in a month

push event AndreasMadsen/andreasmadsen.github.io

Andreas Madsen

commit sha 3f767909f971c8a8565fcc397d3812405c444c4a

fix description text

push time in a month

push event AndreasMadsen/andreasmadsen.github.io

Andreas Madsen

commit sha 5dfab1d6f0444f208a4b4234c18c86838df02fe2

fix font-smoothing

push time in a month

pull request comment tensorflow/addons

[Backport r0.11] Beautifier layers doc

Rubber stamp LGTM.

bot-of-gabrieldemarmiesse

comment created time in a month

issue closed AndreasMadsen/ttest

Pvalue is NaN

ttest([1,1,1,1], [2,2,2,2]).pValue() === NaN

closed time in a month

brendandahl

issue comment AndreasMadsen/ttest

Pvalue is NaN

A historical variance can now be directly specified using the plain-object API. This is the only reasonable solution to this issue I can think of.
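
For context, one plausible source of the NaN, assuming a Welch-style test (not confirmed from the ttest source here): with two zero-variance samples the Welch-Satterthwaite degrees of freedom evaluate to 0/0, which is NaN in IEEE-754 (and hence JavaScript) arithmetic. A Python illustration of the arithmetic:

a = [1, 1, 1, 1]
b = [2, 2, 2, 2]
n1, n2 = len(a), len(b)

mean1, mean2 = sum(a) / n1, sum(b) / n2
var1 = sum((x - mean1) ** 2 for x in a) / (n1 - 1)  # 0.0
var2 = sum((x - mean2) ** 2 for x in b) / (n2 - 1)  # 0.0

# Welch-Satterthwaite degrees of freedom: numerator and denominator
# are both exactly 0.0 when both sample variances are zero.
num = (var1 / n1 + var2 / n2) ** 2                               # 0.0
den = (var1 / n1) ** 2 / (n1 - 1) + (var2 / n2) ** 2 / (n2 - 1)  # 0.0

# JavaScript evaluates 0/0 to NaN; plain Python floats raise
# ZeroDivisionError instead, so guard explicitly for the illustration.
df = num / den if den != 0 else float('nan')
print(df)  # nan

Supplying a historical, nonzero variance keeps the computation away from the 0/0.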

brendandahl

comment created time in a month

issue closed AndreasMadsen/summary

TypeScript types

It would be awesome if you could add TypeScript types to this package. That would help all of us using TypeScript :)

closed time in a month

bfelbo

issue comment AndreasMadsen/summary

TypeScript types

See https://github.com/AndreasMadsen/ttest/issues/14

bfelbo

comment created time in a month

issue closed AndreasMadsen/ttest

TypeScript types

It would be awesome if you could add TypeScript types to this package. That would help all of us using TypeScript :)

closed time in a month

bfelbo

push event AndreasMadsen/andreasmadsen.github.io

Andreas Madsen

commit sha a6db39722eabf1646e084d9e57014811f16d6c46

fix links in pdf

Andreas Madsen

commit sha e4e6209c08b6a3c8269408c42b0108c3ea795b55

update description

push time in a month

issue comment AndreasMadsen/ttest

TypeScript types

This is not really something I want to maintain. I understand the need, but I've also seen it become quite the maintenance burden in other projects. You are of course welcome to submit types to https://github.com/DefinitelyTyped/DefinitelyTyped; as I don't intend to change the API, I think that will work fine for you.

bfelbo

comment created time in a month

push event AndreasMadsen/ttest

Andreas Madsen

commit sha 7a3c1d73144d7ebd007dcd69a7d1d3179440ef34

fix doc formatting

push time in a month

issue closed AndreasMadsen/ttest

Upgrade summary dependency to 1.0

Hey @AndreasMadsen, nice stats packages you've made!

We'd like to use both this and https://github.com/AndreasMadsen/summary 1.0 in the same codebase. However, this package uses summary 0.3.x. Can you update the dependency?

closed time in a month

bfelbo

issue comment AndreasMadsen/ttest

Upgrade summary dependency to 1.0

Hi :D

Thanks for pointing this out. I've fixed some things in summary. The latest version is now 2.0.0, and I've updated ttest accordingly.

bfelbo

comment created time in a month

issue closed AndreasMadsen/ttest

Feature Request: "BigNum/BigDecimal" interface

It would be nice to be able to perform calculations on arbitrarily large numbers without having to worry about the limitations of JavaScript's Number (or BigInt for that matter).

closed time in a month

mscdex

issue comment AndreasMadsen/ttest

Feature Request: "BigNum/BigDecimal" interface

Summary 2.0.0 now uses numerically stable algorithms for mean and variance.
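
The standard technique here is Welford's online algorithm. A sketch in Python for illustration only (summary itself is JavaScript, and this is not its actual implementation):

def welford(values):
    """Numerically stable streaming mean and sample variance."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # uses the already-updated mean
    variance = m2 / (count - 1) if count > 1 else float('nan')
    return mean, variance

# The naive E[x^2] - E[x]^2 formula cancels catastrophically when the
# values sit far from zero; Welford's update does not:
print(welford([1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]))
# (1000000010.0, 30.0)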

mscdex

comment created time in a month

push event AndreasMadsen/ttest

Andreas Madsen

commit sha ca21f214690c3d9157cc931857f539dd9a3287cb

upgrade to summary 2.0.0

Andreas Madsen

commit sha 0b8b73f3ff81a0b69bb7143e5323503c5925b966

version 3.0.0 Major bump, due to summary dependency

push time in a month

push event AndreasMadsen/summary

Andreas Madsen

commit sha d90a56b59fba13ab745395a9dfe4b1e0a176004d

numerically stable and strict mode

Andreas Madsen

commit sha 2534fe9720cae235c518782e403822abb6be5162

test all branches

Andreas Madsen

commit sha bd1d2f0fc99726a11f55182f1a2ebc16b0303312

version 2.0.0 Major bump, because of different algorithms and enabling strict mode.

Andreas Madsen

commit sha 97e0b564032affd8ce90a97682e4a4fa0e2ddc4f

ignore .nyc_output

push time in a month

pull request comment huggingface/transformers

Fix saved model creation

@jplu What are the issues with deleting cast_bool_to_primitive altogether?

jplu

comment created time in 2 months

issue comment nodejs/diagnostics

Proposal for reworking promise integration into async_hooks

@puzpuzpuz Well, that is really one of the primary benefits of this approach.

AndreasMadsen

comment created time in 2 months

issue comment tensorflow/tensorflow

Keras Metric Multiple Outputs / Inputs

Just wanted to echo this!

It is currently not possible to have a metric that aggregates multiple model outputs without resorting to callback hacks. A common real-world example is the Labeled Attachment Score (LAS), often used in NLP dependency parsing. A model essentially outputs both an index and a label, and the LAS is essentially an accuracy measure for when both match.
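
For concreteness, a minimal sketch of LAS as a joint accuracy over two model outputs (tensor names are hypothetical; the point is that the metric needs both outputs at once):

import tensorflow as tf

def labeled_attachment_score(head_true, head_pred, label_true, label_pred):
    # LAS: fraction of tokens where BOTH the predicted head index and
    # the predicted dependency label match the gold annotation.
    head_match = tf.equal(head_true, head_pred)
    label_match = tf.equal(label_true, label_pred)
    both = tf.logical_and(head_match, label_match)
    return tf.reduce_mean(tf.cast(both, tf.float32))

# toy usage: the third token has the wrong head, so LAS = 2/3
las = labeled_attachment_score(
    head_true=tf.constant([2, 0, 2]), head_pred=tf.constant([2, 0, 1]),
    label_true=tf.constant([1, 3, 4]), label_pred=tf.constant([1, 3, 4]))
print(las.numpy())  # 0.6666667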

AmitMY

comment created time in 2 months

Pull request review comment huggingface/transformers

Fix saved model creation

 def call(self, inputs, training=False):
             k = tf.concat((past_key, k), axis=-2)
             v = tf.concat((past_value, v), axis=-2)
-        # to cope with keras serialization
-        use_cache = cast_bool_to_primitive(use_cache, True)
-
         if use_cache is True:

Be careful with these is True statements, since they won't work if True is accidentally cast to tf.constant(True).
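
A quick illustration of the hazard (this is plain Python object identity, nothing TensorFlow-specific about is):

import tensorflow as tf

use_cache = tf.constant(True)
print(use_cache is True)  # False -- a Tensor is never the object True
print(bool(use_cache))    # True, but only works in eager mode
# `if use_cache:` is the robust spelling: AutoGraph rewrites it to
# tf.cond inside a tf.function, and eager mode calls bool() on it.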

jplu

comment created time in 2 months

issue comment huggingface/transformers

cast_bool_to_primitive breaks TensorFlow graph support.

Nevertheless, I've just spotted another problem with the usage of TensorArray: instead of having a tuple that looks like (batch_size, num_tokens, hidden_states_size) * num_layers, we get a tensor that looks like (num_layers, batch_size, num_tokens, hidden_states_size), which causes several other issues later.

To avoid a breaking change, you could do it as a tuple of empty tensors. For the next major version, I would suggest it become one big tensor. You can transpose/swap the first and last axes to make it mostly compatible with indexing, unpacking, etc.
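
For illustration, the layout difference with hypothetical BERT-base-like shapes (tf.unstack is what recovers the tuple-like per-layer view from the stacked tensor):

import tensorflow as tf

# TensorArray.stack() yields a layer-major tensor:
stacked = tf.zeros((12, 8, 128, 768))  # (num_layers, batch, tokens, hidden)

# Unstacking along axis 0 gives num_layers tensors of
# (batch, tokens, hidden), which behaves like the old tuple:
per_layer = tf.unstack(stacked, axis=0)
print(len(per_layer), per_layer[0].shape)  # 12 (8, 128, 768)

# The first/last axis swap mentioned above:
swapped = tf.transpose(stacked, perm=[3, 1, 2, 0])
print(swapped.shape)  # (768, 8, 128, 12)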

AndreasMadsen

comment created time in 2 months

issue comment huggingface/transformers

cast_bool_to_primitive breaks TensorFlow graph support.

Hi @jplu, I will leave it up to you to decide what is "wanted". But you should consider the usage pattern when unpacking the output:

With a fixed set of three outputs:

@tf.function
def usage(hasOutput1, hasOutput2):
  (one, output1, output2) = output(hasOutput1, hasOutput2)

  tf.print(one)
  if hasOutput1:
    tf.print(output1)
  if hasOutput2:
    tf.print(output2)

With a variable number of outputs:

@tf.function
def usage(hasOutput1, hasOutput2):

  output1 = tf.zeros((0,))
  output2 = tf.zeros((0,))
  if hasOutput1 and hasOutput2:
    (one, output1, output2) = output(hasOutput1, hasOutput2)
  elif hasOutput1:
    (one, output1) = output(hasOutput1, hasOutput2)
  elif hasOutput2:
    (one, output2) = output(hasOutput1, hasOutput2)
  else:
    (one, ) = output(hasOutput1, hasOutput2)

  tf.print(one)
  if hasOutput1:
    tf.print(output1)
  if hasOutput2:
    tf.print(output2)

AndreasMadsen

comment created time in 2 months

issue comment huggingface/transformers

cast_bool_to_primitive breaks TensorFlow graph support.

@jplu Good point regarding the variable output-signature. I think it is perfectly acceptable to assert for a primitive input, at least for now.

Alternatively, the solution would be to return an empty tensor when tf.constant([False]) is used. Such an approach could look like this:

import tensorflow as tf

@tf.function
def output(output1, output2):
    first_conditional_output = tf.TensorArray(tf.int64, size=0, dynamic_size=True, clear_after_read=True)
    second_conditional_output = tf.TensorArray(tf.int64, size=0, dynamic_size=True, clear_after_read=True)

    for i in range(2):
        if output1:
            first_conditional_output = first_conditional_output.write(i, i)
        if output2:
            second_conditional_output = second_conditional_output.write(i, i)

    outputs = (0,)
    if isinstance(output1, tf.Tensor) or output1:
        outputs = outputs + (first_conditional_output.stack(),)

    if isinstance(output2, tf.Tensor) or output2:
        outputs = outputs + (second_conditional_output.stack(),)

    return outputs

AndreasMadsen

comment created time in 2 months

issue comment huggingface/transformers

cast_bool_to_primitive breaks TensorFlow graph support.

@jplu Sorry for the misunderstanding. The test now works because output_hidden_states is now an auxiliary-input, thus it stays a primitive and the casting is no longer involved. However, it is still not good practice in TensorFlow, and it doesn't take much to break it.

I read your code more thoroughly, and have the following two failure cases for you.

Case 1:

bert = tf.function(transformers.TFBertForMaskedLM.from_pretrained('bert-base-uncased'))
outputs = bert(tf.constant([[10,11,12]]), output_attentions=True)

Fails with an IndexError, essentially because the casting cannot be done in a tf.function.

Case 2:

bert = tf.function(transformers.TFBertForMaskedLM.from_pretrained('bert-base-uncased'))
outputs = bert(tf.constant([[10,11,12]]), output_hidden_states=tf.constant(True))

outputs only one output (two are expected), because a tf.constant cannot be cast in a tf.function.


However, instead of working around these issues, I would like you to read the documentation on AutoGraph.

I really think there is a misunderstanding here about what TensorFlow AutoGraph can do for you and why casting to primitives is really not necessary at all. I would suggest you read https://www.tensorflow.org/guide/function#conditionals and also check out the hidden docs, such as https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/autograph/g3doc/reference/control_flow.md#effects-of-the-tracing-process which explains it in more detail.

What @foxik says is true, but I don't think depending on the auxiliary-inputs addresses the misunderstanding; it just avoids it. Truly, casting to primitives is just not a good idea.

AndreasMadsen

comment created time in 2 months

issue comment huggingface/transformers

cast_bool_to_primitive breaks TensorFlow graph support.

Hi @jplu, I'm sorry, but I doubt #5468 will fix the issue. Fundamentally speaking, casting to primitives is not a good practice in TensorFlow, as it invalidates the use of @tf.function and is generally unnecessary, as described above. Casting to primitives is, in my experience, just never the correct solution in TensorFlow.

I do think #5468 mitigates the issue, which is maybe where the confusion is coming from. This is because the models will now correctly default to the config object when output_hidden_states=True is not specified as an input. In those cases the object property is never cast to a tensor to begin with, therefore the @tf.function graph will be statically compiled to always output the hidden_states, as intended.

However, the behavior is different when output_hidden_states=True is specified as an input, as it will be cast to a Tensor when it becomes part of the inputs argument in call(). After that, it is not possible to convert back to a primitive, as that invalidates @tf.function.

If you insist on keeping it as a primitive, the best solution might be to specify it as an aux-input, similar to training and mask in a keras.layers.Layer, as they don't get converted the same way. I'm not familiar enough with the Keras internals to know the details here, and I think it might also be incompatible with compute_output_shape etc.
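
For reference, a sketch of the aux-input pattern referred to above: a plain Python keyword argument in call(), like training, is not packed into the inputs tensor structure and therefore stays a primitive. This is an illustration of the pattern, not transformers' implementation:

import tensorflow as tf

class WithOptionalOutput(tf.keras.layers.Layer):
    def call(self, inputs, output_hidden_states=False, training=False):
        # output_hidden_states arrives as a plain Python bool, so this
        # branch is resolved at trace time and each traced graph has a
        # fixed output signature.
        hidden = inputs * 2.0
        if output_hidden_states:
            return hidden, (hidden,)
        return hidden

layer = WithOptionalOutput()
x = tf.ones((2, 3))
print(layer(x).shape)                            # (2, 3)
print(len(layer(x, output_hidden_states=True)))  # 2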

BTW, in the keras RNN layers, hidden_state is only specified in the constructor, probably because it can get a bit messy having to specify it in the inputs, but I don't see anything fundamentally wrong with specifying it in inputs.

AndreasMadsen

comment created time in 2 months

issue opened huggingface/transformers

cast_bool_to_primitive breaks TensorFlow graph support.

🐛 Bug

To reproduce

import tensorflow as tf
import transformers

bert = tf.function(transformers.TFBertForMaskedLM.from_pretrained('bert-base-uncased'))

for i in range(2):
    (_, hidden_state) = bert(tf.constant([[10,11,12]]), output_hidden_states=True)
    print(f'computed {i}')

Errors with

ValueError: not enough values to unpack (expected 2, got 1)

Expected behavior

computed 0
computed 1

Same result as if tf.function was not used.

Environment info

Example environment : https://colab.research.google.com/gist/AndreasMadsen/593df94a3319dee58bba33a26efedeb3/untitled6.ipynb

  • transformers version: 3.0.2
  • Platform: Linux-4.19.104+-x86_64-with-Ubuntu-18.04-bionic
  • Python version: 3.6.9
  • PyTorch version (GPU?): 1.5.1+cu101 (False)
  • Tensorflow version (GPU?): 2.2.0 (False)
  • Using GPU in script?: <fill in>
  • Using distributed or parallel set-up in script?: <fill in>

Details

The bug happens due to cast_bool_to_primitive, which was introduced in https://github.com/huggingface/transformers/commit/6e603cb7892b49a2cbbc10ba859759f92c3fb7a6. Before that, it was possible to get the hidden_states from Bert in TensorFlow graph/function mode.

Generally speaking, casting TensorFlow tensors to primitives is not a good practice, as it only works in eager mode. It is also completely unnecessary in this case, as using if bool_tensor_scalar: works perfectly fine.

def print_bool(x):
    if x:
        print('True')
    else:
        print('False')

print_bool_graph = tf.function(print_bool)

print('eager:')
print_bool(True) # Prints True
print_bool(False) # Prints False
print_bool(tf.constant(True)) # Prints True
print_bool(tf.constant(False)) # Prints False

print('')
print('graph:')
print_bool_graph(True) # Prints True
print_bool_graph(False) # Prints False
print_bool_graph(tf.constant(True)) # Prints True
print_bool_graph(tf.constant(False)) # Prints False

I can see there are some cases where defaults are used. The right way to handle that is to implement the default handling upstream in the first call() method. A lesser way would be to implement it as:

def cast_bool_to_primitive(x, default_value=False):
  if x is None:
    return default_value
  return x

created time in 2 months

push event AndreasMadsen/python-lrcurve

Andreas Madsen

commit sha 50de38553d8da6076e94a0c5c9fc9f22cc0e1c95

add support for log10 y-scale

Andreas Madsen

commit sha db97d4ba07d007af25e3b5142723ff7fa63f8a88

update notebooks

Andreas Madsen

commit sha 8e61cbca81e313763c8b6d7db773f322761fe065

version 2.1.0

push time in 3 months
