swoop-inc/elasticsearch-statsd-plugin 15

This repo is now archived. Please use Automattic's fork for updates.

swoop-inc/fast_cache 9

Very fast in-process cache with least-recently used (LRU) and time-to-live (TTL) expiration semantics.

swoop-inc/composable_state_machine 4

Tiny state machine implementation with clean separation between transitions, transition logic & state management.

swoop-inc/digester 2

Ruby data structure digester using MD5 hashes

swoop-inc/cuttle 1

An embedded job scheduler.

swoop-inc/datascience 1

Data and code for our data science writing

swoop-inc/graflux 1

InfluxDB storage adaptor for graphite-api

swoop-inc/activerecord-import 0

A library for bulk insertion of data into your database using ActiveRecord.

swoop-inc/amphtml 0

AMP HTML source code, samples, and documentation. See below for more info.

started swoop-inc/spark-alchemy

started time in 15 hours

started swoop-inc/spark-alchemy

started time in 4 days

push event swoop-inc/postgraphile-upsert-plugin

Greg Lu

commit sha 6b3138a4c7ad228353b680fb71b23fa76f89c971

Fixed tsc build in postinstall

view details

push time in 5 days

push event swoop-inc/postgraphile-upsert-plugin

Greg Lu

commit sha 2615ef894b63a20956d60380ff23488264eb8a05

Changed name in package.json

view details

push time in 5 days

create branch swoop-inc/postgraphile-upsert-plugin

branch: swoop

created branch time in 5 days

issue comment swoop-inc/spark-records

Fix the GitHub page for this project

That is very odd: the gh-pages branch hasn't been updated in years.

MrPowers

comment created time in 8 days

started swoop-inc/spark-alchemy

started time in 11 days

push event swoop-inc/spark-records

MrPowers

commit sha 412fe5125b61b00b490e01f249ae4af1ce8b714b

Remove spark-test-sugar dependency

view details

MrPowers

commit sha 57aa9b727662df0f776e62cbb757f5efd01f7830

Use 2 cores when running tests

view details

Simeon Simeonov

commit sha 5ea2935e015186c284e2ed87494d744fc3e182d3

Merge pull request #9 from MrPowers/remove-test-sugar

Remove spark-test-sugar dependency

view details

push time in 11 days

PR merged swoop-inc/spark-records

Remove spark-test-sugar dependency

Removing the spark-test-sugar dependency will let us cross compile this library with Scala 2.12.
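
A rough illustration of what enabling the cross build might then look like in build.sbt (the version numbers below are placeholders, not the ones the project actually uses):

// build.sbt — cross-compile the library for both Scala lines (illustrative versions)
crossScalaVersions := Seq("2.11.12", "2.12.12")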

+33 -12

0 comments

5 changed files

MrPowers

pr closed time in 11 days

Pull request review comment swoop-inc/spark-records

Remove spark-test-sugar dependency

 package examples.fancy_numbers
 
+import com.swoop.spark.SparkSessionTestWrapper
 import com.swoop.spark.records._
-import com.swoop.spark.test.SparkSqlSpec
 import org.apache.spark.sql.Dataset
 import org.apache.spark.storage.StorageLevel
 
-class SparkTest extends ExampleSpec with SparkSqlSpec with TestNegative5To100 {
+class SparkTest extends ExampleSpec with SparkSessionTestWrapper with TestNegative5To100 {
 
+  val sc = spark.sparkContext
   lazy val dc = SimpleDriverContext(sc)
   lazy val jc = dc.jobContext(SimpleJobContext)
   lazy val ds = recordsDataset(-5 to 100, jc)
   lazy val records = ds.collect
 
   "in an integration test" - {
     implicit val env = FlatRecordEnvironment()
-    val sqlContext = sqlc
-    import sqlContext.implicits._
+    import spark.implicits._
 
     behave like fancyRecordBuilder(records, jc)
 
     "should build records with Spark" in {
       ds.count should be(105)
     }
+

Yep, agreed, updated!

MrPowers

comment created time in 11 days

Pull request review comment swoop-inc/spark-records

Remove spark-test-sugar dependency

+package com.swoop.spark
+
+import org.apache.spark.sql.SparkSession
+
+trait SparkSessionTestWrapper {
+
+  lazy val spark: SparkSession = {
+    SparkSession
+      .builder()
+      .master("local")
+      .appName("spark-records")
+      .config(

Copied this over from a project using scalafmt. Updated the code to put this all on one line. My Scala formatting is the worst, so feel free to format it however you like.

MrPowers

comment created time in 11 days

Pull request review comment swoop-inc/spark-records

Remove spark-test-sugar dependency

+package com.swoop.spark
+
+import org.apache.spark.sql.SparkSession
+
+trait SparkSessionTestWrapper {
+
+  lazy val spark: SparkSession = {
+    SparkSession
+      .builder()
+      .master("local")

Great point, updated to 2 cores & 4 shuffle partitions.

MrPowers

comment created time in 11 days

Pull request review comment swoop-inc/spark-records

Remove spark-test-sugar dependency

+# Set everything to be logged to the console
+log4j.rootCategory=ERROR, console
+log4j.appender.console=org.apache.log4j.ConsoleAppender
+log4j.appender.console.target=System.err
+log4j.appender.console.layout=org.apache.log4j.PatternLayout
+log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
+
+# Settings to quiet third party logs that are too verbose
+log4j.logger.org.eclipse.jetty=WARN
+log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
+log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=WARN
+log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=WARN

Fixed, good catch ;)

MrPowers

comment created time in 11 days

Pull request review comment swoop-inc/spark-records

Remove spark-test-sugar dependency

 package examples.fancy_numbers
 
+import com.swoop.spark.SparkSessionTestWrapper
 import com.swoop.spark.records._
-import com.swoop.spark.test.SparkSqlSpec
 import org.apache.spark.sql.Dataset
 import org.apache.spark.storage.StorageLevel
 
-class SparkTest extends ExampleSpec with SparkSqlSpec with TestNegative5To100 {
+class SparkTest extends ExampleSpec with SparkSessionTestWrapper with TestNegative5To100 {
 
+  val sc = spark.sparkContext
   lazy val dc = SimpleDriverContext(sc)
   lazy val jc = dc.jobContext(SimpleJobContext)
   lazy val ds = recordsDataset(-5 to 100, jc)
   lazy val records = ds.collect
 
   "in an integration test" - {
     implicit val env = FlatRecordEnvironment()
-    val sqlContext = sqlc
-    import sqlContext.implicits._
+    import spark.implicits._
 
     behave like fancyRecordBuilder(records, jc)
 
     "should build records with Spark" in {
       ds.count should be(105)
     }
+

No need for the extra whitespace for such simple tests: the description and { ... } create sufficient visual separation without wasting vertical space.

MrPowers

comment created time in 11 days

Pull request review comment swoop-inc/spark-records

Remove spark-test-sugar dependency

+package com.swoop.spark
+
+import org.apache.spark.sql.SparkSession
+
+trait SparkSessionTestWrapper {
+
+  lazy val spark: SparkSession = {
+    SparkSession
+      .builder()
+      .master("local")
+      .appName("spark-records")
+      .config(

What's the value of one config setting being split across 4 lines?

MrPowers

comment created time in 11 days

Pull request review comment swoop-inc/spark-records

Remove spark-test-sugar dependency

+package com.swoop.spark
+
+import org.apache.spark.sql.SparkSession
+
+trait SparkSessionTestWrapper {
+
+  lazy val spark: SparkSession = {
+    SparkSession
+      .builder()
+      .master("local")

Running tests with a single worker and a single partition can hide errors related to parallel execution. We tend to run local tests with 2 workers.
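
For illustration, a minimal sketch of a test session wrapper configured along these lines (assuming a SparkSessionTestWrapper trait like the one in this diff; the exact settings in the merged change may differ):

package com.swoop.spark

import org.apache.spark.sql.SparkSession

// Test-only SparkSession: 2 local workers so parallelism-related bugs are not masked,
// and a small shuffle-partition count to keep local test runs fast.
trait SparkSessionTestWrapper {

  lazy val spark: SparkSession = {
    SparkSession
      .builder()
      .master("local[2]")
      .appName("spark-records")
      .config("spark.sql.shuffle.partitions", "4")
      .getOrCreate()
  }
}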

MrPowers

comment created time in 11 days

Pull request review comment swoop-inc/spark-records

Remove spark-test-sugar dependency

+# Set everything to be logged to the console
+log4j.rootCategory=ERROR, console
+log4j.appender.console=org.apache.log4j.ConsoleAppender
+log4j.appender.console.target=System.err
+log4j.appender.console.layout=org.apache.log4j.PatternLayout
+log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
+
+# Settings to quiet third party logs that are too verbose
+log4j.logger.org.eclipse.jetty=WARN
+log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
+log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=WARN
+log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=WARN

Missing newline.

MrPowers

comment created time in 11 days

issue opened swoop-inc/spark-records

Fix the GitHub page for this project

The GitHub page isn't working at the moment.

It's giving a 404 "There isn't a GitHub Pages site here" error.

The console says "Failed to load resource: the server responded with a status of 404 ()".

I say we update all the plugin versions to match spark-alchemy, regenerate the GitHub pages, and see if that fixes the problem. Sounds good?

created time in 15 days

started swoop-inc/spark-records

started time in 19 days

push event swoop-inc/spark-records

MrPowers

commit sha 5b65aaeec23a613fccc74cabc58d1f831af91ed9

Add resolver to fetch the spark-test-sugar dependency

view details

Simeon Simeonov

commit sha b9e65b900b8b0af26d0717d0fbc1c689c9948123

Merge pull request #7 from MrPowers/second-attempt-fix-build

Add resolver to fetch the spark-test-sugar dependency

view details

push time in 20 days

PR merged swoop-inc/spark-records

Add resolver to fetch the spark-test-sugar dependency

Thanks for building this library 😄

I was getting this error when running sbt test:

[info] Resolving com.swoop#spark-test-sugar_2.11;1.5.0 ...
[warn] 	module not found: com.swoop#spark-test-sugar_2.11;1.5.0
[warn] ==== local: tried
[warn]   /Users/matthewpowers/.ivy2/local/com.swoop/spark-test-sugar_2.11/1.5.0/ivys/ivy.xml
[warn] ==== public: tried
[warn]   https://repo1.maven.org/maven2/com/swoop/spark-test-sugar_2.11/1.5.0/spark-test-sugar_2.11-1.5.0.pom
[warn] ==== local-preloaded-ivy: tried
[warn]   /Users/matthewpowers/.sbt/preloaded/com.swoop/spark-test-sugar_2.11/1.5.0/ivys/ivy.xml
[warn] ==== local-preloaded: tried
[warn]   file:////Users/matthewpowers/.sbt/preloaded/com/swoop/spark-test-sugar_2.11/1.5.0/spark-test-sugar_2.11-1.5.0.pom
[warn] ==== tpolecat: tried
[warn]   http://dl.bintray.com/tpolecat/maven/com/swoop/spark-test-sugar_2.11/1.5.0/spark-test-sugar_2.11-1.5.0.pom
[warn] 	::::::::::::::::::::::::::::::::::::::::::::::
[warn] 	::          UNRESOLVED DEPENDENCIES         ::
[warn] 	::::::::::::::::::::::::::::::::::::::::::::::
[warn] 	:: com.swoop#spark-test-sugar_2.11;1.5.0: not found
[warn] 	::::::::::::::::::::::::::::::::::::::::::::::

It was looking in http://dl.bintray.com/tpolecat/maven/com/swoop/spark-test-sugar_2.11 for spark-test-sugar instead of in https://dl.bintray.com/swoop-inc/maven/. Adding the resolver in the build.sbt file fixes this build on my machine. Quite the strange error message!
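
For reference, the kind of change involved is a single resolver line in build.sbt along these lines (the resolver label below is mine, for illustration):

// build.sbt — point sbt at Swoop's Bintray repository so the com.swoop artifacts resolve
resolvers += "swoop-inc bintray" at "https://dl.bintray.com/swoop-inc/maven/"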

+2 -0

0 comments

2 changed files

MrPowers

pr closed time in 20 days

started swoop-inc/spark-records

started time in 21 days

started swoop-inc/spark-records

started time in 21 days

PR opened swoop-inc/spark-records

Add resolver to fetch the spark-test-sugar dependency

Thanks for building this library 😄

I was getting this error when running sbt test:

[info] Resolving com.swoop#spark-test-sugar_2.11;1.5.0 ...
[warn] 	module not found: com.swoop#spark-test-sugar_2.11;1.5.0
[warn] ==== local: tried
[warn]   /Users/matthewpowers/.ivy2/local/com.swoop/spark-test-sugar_2.11/1.5.0/ivys/ivy.xml
[warn] ==== public: tried
[warn]   https://repo1.maven.org/maven2/com/swoop/spark-test-sugar_2.11/1.5.0/spark-test-sugar_2.11-1.5.0.pom
[warn] ==== local-preloaded-ivy: tried
[warn]   /Users/matthewpowers/.sbt/preloaded/com.swoop/spark-test-sugar_2.11/1.5.0/ivys/ivy.xml
[warn] ==== local-preloaded: tried
[warn]   file:////Users/matthewpowers/.sbt/preloaded/com/swoop/spark-test-sugar_2.11/1.5.0/spark-test-sugar_2.11-1.5.0.pom
[warn] ==== tpolecat: tried
[warn]   http://dl.bintray.com/tpolecat/maven/com/swoop/spark-test-sugar_2.11/1.5.0/spark-test-sugar_2.11-1.5.0.pom
[warn] 	::::::::::::::::::::::::::::::::::::::::::::::
[warn] 	::          UNRESOLVED DEPENDENCIES         ::
[warn] 	::::::::::::::::::::::::::::::::::::::::::::::
[warn] 	:: com.swoop#spark-test-sugar_2.11;1.5.0: not found
[warn] 	::::::::::::::::::::::::::::::::::::::::::::::

It was looking in http://dl.bintray.com/tpolecat/maven/com/swoop/spark-test-sugar_2.11 for spark-test-sugar instead of in https://dl.bintray.com/swoop-inc/maven/. Adding the resolver in the build.sbt file fixes this build on my machine. Quite the strange error message!

+2 -0

0 comments

2 changed files

pr created time in 22 days

fork MrPowers/spark-records

Bulletproof Apache Spark jobs with fast root cause analysis of failures.

https://swoop-inc.github.io/spark-records/

fork in 22 days

started swoop-inc/spark-records

started time in 22 days

issue opened swoop-inc/spark-alchemy

Installing spark-alchemy on Spark 3.0 breaks reading in dataframes

I recently tried installing spark-alchemy on Spark 3.0 using the following:

spark-shell --repositories https://dl.bintray.com/swoop-inc/maven/ --packages com.swoop:spark-alchemy_2.12:1.0.0

However, once in the shell, I can't read in any files. The following code returns the error below:

val df = spark.read.parquet("path/to/file")

Note: when I run a regular spark-shell, I can read in the data fine.

20/09/21 13:16:17 ERROR Utils: Aborting task                        (0 + 1) / 1]

java.io.IOException: Failed to connect to m298188dljg5j.symc.symantec.com/172.19.129.194:49801

       at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:253)

       at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:195)

       at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:392)

       at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$openChannel$4(NettyRpcEnv.scala:360)

       at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)

       at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1411)

       at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:359)

       at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:719)

       at org.apache.spark.util.Utils$.fetchFile(Utils.scala:535)

       at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7(Executor.scala:869)

       at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7$adapted(Executor.scala:860)

       at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:877)

       at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)

       at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)

       at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)

       at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)

       at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)

       at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:876)

       at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:860)

       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:404)

       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)

       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)

       at java.base/java.lang.Thread.run(Thread.java:834)

Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Operation timed out: m298188dljg5j.symc.symantec.com/172.19.129.194:49801

Caused by: java.net.ConnectException: Operation timed out

       at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

       at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779)

       at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)

       at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)

       at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)

       at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)

       at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)

       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)

       at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)

       at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)

       at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)

       at java.base/java.lang.Thread.run(Thread.java:834)

20/09/21 13:16:17 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)

java.io.IOException: Failed to connect to m298188dljg5j.symc.symantec.com/172.19.129.194:49801

       at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:253)

       at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:195)

       at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:392)

       at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$openChannel$4(NettyRpcEnv.scala:360)

       at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)

       at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1411)

       at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:359)

       at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:719)

       at org.apache.spark.util.Utils$.fetchFile(Utils.scala:535)

       at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7(Executor.scala:869)

       at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7$adapted(Executor.scala:860)

       at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:877)

       at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)

       at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)

       at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)

       at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)

       at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)

       at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:876)

       at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:860)

       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:404)

       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)

       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)

       at java.base/java.lang.Thread.run(Thread.java:834)

Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Operation timed out: m298188dljg5j.symc.symantec.com/172.19.129.194:49801

Caused by: java.net.ConnectException: Operation timed out

       at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

       at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779)

       at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)

       at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)

       at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)

       at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)

       at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)

       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)

       at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)

       at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)

       at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)

       at java.base/java.lang.Thread.run(Thread.java:834)

20/09/21 13:16:17 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, m298188dljg5j.symc.symantec.com, executor driver): java.io.IOException: Failed to connect to m298188dljg5j.symc.symantec.com/172.19.129.194:49801

       at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:253)

       at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:195)

       at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:392)

       at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$openChannel$4(NettyRpcEnv.scala:360)

       at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)

       at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1411)

       at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:359)

       at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:719)

       at org.apache.spark.util.Utils$.fetchFile(Utils.scala:535)

       at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7(Executor.scala:869)

       at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7$adapted(Executor.scala:860)

       at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:877)

       at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)

       at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)

       at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)

       at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)

       at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)

       at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:876)

       at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:860)

       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:404)

       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)

       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)

       at java.base/java.lang.Thread.run(Thread.java:834)

Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Operation timed out: m298188dljg5j.symc.symantec.com/172.19.129.194:49801

Caused by: java.net.ConnectException: Operation timed out

       at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

       at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779)

       at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)

       at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)

       at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)

       at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)

       at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)

       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)

       at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)

       at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)

       at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)

       at java.base/java.lang.Thread.run(Thread.java:834)

 

20/09/21 13:16:17 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, m298188dljg5j.symc.symantec.com, executor driver): java.io.IOException: Failed to connect to m298188dljg5j.symc.symantec.com/172.19.129.194:49801

       at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:253)

       at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:195)

       at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:392)

       at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$openChannel$4(NettyRpcEnv.scala:360)

       at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)

       at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1411)

       at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:359)

       at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:719)

       at org.apache.spark.util.Utils$.fetchFile(Utils.scala:535)

       at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7(Executor.scala:869)

       at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7$adapted(Executor.scala:860)

       at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:877)

       at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)

       at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)

       at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)

       at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)

       at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)

       at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:876)

       at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:860)

       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:404)

       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)

       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)

       at java.base/java.lang.Thread.run(Thread.java:834)

Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Operation timed out: m298188dljg5j.symc.symantec.com/172.19.129.194:49801

Caused by: java.net.ConnectException: Operation timed out

       at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

       at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779)

       at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)

       at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)

       at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)

       at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)

       at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)

       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)

       at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)

       at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)

       at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)

       at java.base/java.lang.Thread.run(Thread.java:834)

 

Driver stacktrace:

  at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2023)

  at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:1972)

  at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:1971)

  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)

  at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)

  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)

  at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1971)

  at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:950)

  at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:950)

  at scala.Option.foreach(Option.scala:407)

  at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:950)

  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2203)

  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2152)

  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2141)

  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)

  at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:752)

  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2093)

  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2114)

  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2133)

  at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:467)

  at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:420)

  at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:47)

  at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:3625)

  at org.apache.spark.sql.Dataset.$anonfun$head$1(Dataset.scala:2695)

  at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3616)

  at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100)

  at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)

  at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)

  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763)

  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)

  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3614)

  at org.apache.spark.sql.Dataset.head(Dataset.scala:2695)

  at org.apache.spark.sql.Dataset.take(Dataset.scala:2902)

  at org.apache.spark.sql.execution.datasources.csv.TextInputCSVDataSource$.infer(CSVDataSource.scala:114)

  at org.apache.spark.sql.execution.datasources.csv.CSVDataSource.inferSchema(CSVDataSource.scala:67)

  at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.inferSchema(CSVFileFormat.scala:62)

  at org.apache.spark.sql.execution.datasources.DataSource.$anonfun$getOrInferFileFormatSchema$11(DataSource.scala:193)

  at scala.Option.orElse(Option.scala:447)

  at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:190)

  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:401)

  at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:279)

  at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:268)

  at scala.Option.getOrElse(Option.scala:189)

  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:268)

  at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:705)

  at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:535)

  ... 47 elided

Caused by: java.io.IOException: Failed to connect to m298188dljg5j.symc.symantec.com/172.19.129.194:49801

  at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:253)

  at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:195)

  at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:392)

  at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$openChannel$4(NettyRpcEnv.scala:360)

  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)

  at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1411)

  at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:359)

  at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:719)

  at org.apache.spark.util.Utils$.fetchFile(Utils.scala:535)

  at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7(Executor.scala:869)

  at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7$adapted(Executor.scala:860)

  at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:877)

  at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)

  at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)

  at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)

  at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)

  at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)

  at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:876)

  at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:860)

  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:404)

  at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)

  at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)

  at java.base/java.lang.Thread.run(Thread.java:834)

Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Operation timed out: m298188dljg5j.symc.symantec.com/172.19.129.194:49801

Caused by: java.net.ConnectException: Operation timed out

  at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)

  at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779)

  at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)

  at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)

  at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)

  at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)

  at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)

  at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)

  at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)

  at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)

  at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)

  at java.base/java.lang.Thread.run(Thread.java:834)

created time in a month

started swoop-inc/spark-alchemy

started time in a month

fork xwzpp/spark-alchemy

Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive

https://swoop-inc.github.io/spark-alchemy/

fork in a month

started swoop-inc/spark-alchemy

started time in a month

started swoop-inc/spark-alchemy

started time in a month

push event swoop-inc/postgraphile-plugin-fulltext-filter

Greg Lu

commit sha 6f6d3a76847c5616c35dc24f268bd5871f022a2d

Bump version

view details

push time in 2 months

create branch swoop-inc/postgraphile-plugin-fulltext-filter

branch: swoop

created branch time in 2 months

started swoop-inc/spark-alchemy

started time in 2 months

started swoop-inc/spark-alchemy

started time in 2 months

started swoop-inc/spark-alchemy

started time in 2 months

push event swoop-inc/spark-alchemy

Aditya Chaganti

commit sha 25c2feaf87ce50d39d769bef1ca89130c98116e1

Adds doc describing the use of HLL functions through PySpark

view details

Aditya Chaganti

commit sha 23858c61774005256d10f5f1bd5b0f7c5d7dcb9e

Adds sbt plugin to enable the easy assembly of a fat JAR of spark-alchemy

view details

push time in 2 months

started swoop-inc/spark-alchemy

started time in 2 months

started swoop-inc/spark-alchemy

started time in 2 months

started swoop-inc/spark-records

started time in 2 months

push event swoop-inc/fast_cache

Simeon Simeonov

commit sha 5111a30f829604c5ddab5e7cc6563785ae6df5a8

Upgrades Ruby to 2.7.1

view details

push time in 2 months

create branch swoop-inc/fast_cache

branch: ss_ruby271

created branch time in 2 months

push event swoop-inc/composable_state_machine

Simeon Simeonov

commit sha b08271971806e591ceda9931e63b14e265257898

Updates Ruby to 2.7.1

view details

push time in 2 months

started swoop-inc/cuttle

started time in 2 months

push event swoop-inc/fast_cache

Simeon Simeonov

commit sha 9245e3afcd787fa392d18dabf526b41c531cd72f

Removes Travis CI as it is being replaced

view details

push time in 3 months

push event swoop-inc/fast_cache

Simeon Simeonov

commit sha 266a2d8ab80750d94021537841477a9a8cbe0404

Removes broken link

view details

push time in 3 months

push event swoop-inc/composable_state_machine

Simeon Simeonov

commit sha ca3fd7c843843c88a529b92d4147cd20b448a95b

Removes broken link

view details

push time in 3 months

push event swoop-inc/composable_state_machine

Simeon Simeonov

commit sha 89da08d2ec2069bd6ed67496f2354b9244b8b6f6

Removes Gemnasium dependency

view details

push time in 3 months

push event swoop-inc/fast_cache

Simeon Simeonov

commit sha 98cc98b591ba4c9b9843817b7218af2a94584fb6

Removed Gemnasium dependency

view details

push time in 3 months

started swoop-inc/spark-alchemy

started time in 3 months
