Open Business Intelligence

The Business Intelligence network

Hi,

I'm trying to read files from HDFS on Hadoop from Pentaho DI, and I'm running into some problems:

- Pentaho DI (open source) version 6.1, on a Windows 7 machine
- HDFS on a Cloudera QuickStart 5.4 virtual machine

I selected my Hadoop distribution (cdh55) under Tools, created a cluster that points to the VM, and it works correctly when tested.

I then create a Hadoop File Input step; I can browse the cluster's directories and select a file, but I run into problems when I try to read it:

......

Hadoop File Input.0 - ERROR (version 6.1.0.1-196, build 1 from 2016-04-07 12.08.49 by buildguy) : Couldn't open file #1 : hdfs://cloudera:***@192.168.109.128:8020/user/cloudera/prueba.txt --> org.pentaho.di.core.exception.KettleFileException:
2016/09/20 13:27:02 - Hadoop File Input.0 -
2016/09/20 13:27:02 - Hadoop File Input.0 - Exception reading line: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-286282631-127.0.0.1-1433865208026:blk_1073742215_1393 file=/user/cloudera/prueba.txt
2016/09/20 13:27:02 - Hadoop File Input.0 - Could not obtain block: BP-286282631-127.0.0.1-1433865208026:blk_1073742215_1393 file=/user/cloudera/prueba.txt
2016/09/20 13:27:02 - Hadoop File Input.0 -
2016/09/20 13:27:02 - Hadoop File Input.0 - Could not obtain block: BP-286282631-127.0.0.1-1433865208026:blk_1073742215_1393 file=/user/cloudera/prueba.txt
2016/09/20 13:27:02 - Hadoop File Input.0 - Procesamiento finalizado (I=0, O=0, R=0, W=0, U=1, E=1)
2016/09/20 13:27:02 - C:\temp\ejercicios pentaho\tr_conexion_hdfs.ktr : tr_conexion_hdfs - Transformación detectada
2016/09/20 13:27:02 - C:\temp\ejercicios pentaho\tr_conexion_hdfs.ktr : tr_conexion_hdfs - Transformación está matando los otros pasos!

.....
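In case it helps: the `BlockMissingException` above means the client reached the NameNode but could not fetch the block itself from a DataNode. One way to check whether the block is actually healthy on the cluster side is to run `hdfs fsck` inside the VM (a sketch; the file path is the one from the log above):

```shell
# Run inside the Cloudera QuickStart VM.

# Overall HDFS health: live DataNodes, missing/corrupt blocks.
hdfs dfsadmin -report

# Per-file check: list the file's blocks and where their replicas live.
hdfs fsck /user/cloudera/prueba.txt -files -blocks -locations
```

If fsck reports the file as healthy, the block exists, and the problem is more likely that the Windows client cannot reach the DataNode from outside the VM.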

I'm not sure the problem is in Pentaho itself; I've checked the HDFS service on the Hadoop side and it's up.

Can anyone help me?

Thanks in advance,

Juan


Replies to this discussion

Good afternoon,

Once again I'm asking the talented reopenbi community for help.

I've made some progress, but it still isn't working.

Thanks to this page: https://cheranilango.blogspot.com.es

I discovered that I was missing some configuration files.

In the path:

data-integration\plugins\pentaho-big-data-plugin\hadoop-configurations

the following files must exist:

  1. core-site.xml 
  2. hbase-site.xml 
  3. hdfs-site.xml 
  4. hive-site.xml 
  5. mapred-site.xml 
  6. yarn-site.xml

 

Check that all the required ones are present; if any is missing, copy it from one of these directories on the VM:

/etc/hive/conf    or    /etc/hbase/conf
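If it's useful, the copy step above can be scripted with `scp` (a sketch only: the VM IP and the `cdh55` shim name come from this thread, and the local PDI path is a placeholder you'll need to adjust; on Windows you can run this from Git Bash, or just use WinSCP instead):

```shell
# Destination: the active shim folder of the Pentaho Big Data plugin.
# NOTE: the local path below is a placeholder -- adjust it to your install.
SHIM="/c/pentaho/data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/cdh55"

# Hadoop/Hive client configs (deployed under /etc/hive/conf on the VM).
scp "cloudera@192.168.109.128:/etc/hive/conf/*-site.xml" "$SHIM/"

# The HBase config lives in its own directory.
scp "cloudera@192.168.109.128:/etc/hbase/conf/hbase-site.xml" "$SHIM/"
```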

Once that's done, it now seems to find the file, but it throws an error on the read.

I think this error is more identifiable; I'm attaching it in case you can help me. Thanks:

at org.pentaho.commons.launcher.Launcher.main (Launcher.java:92)
at java.lang.reflect.Method.invoke (null:-1)
at sun.reflect.DelegatingMethodAccessorImpl.invoke (null:-1)
at sun.reflect.NativeMethodAccessorImpl.invoke (null:-1)
at sun.reflect.NativeMethodAccessorImpl.invoke0 (null:-2)
at org.pentaho.di.ui.spoon.Spoon.main (Spoon.java:662)
at org.pentaho.di.ui.spoon.Spoon.start (Spoon.java:9269)
at org.pentaho.di.ui.spoon.Spoon.waitForDispose (Spoon.java:7989)
at org.pentaho.di.ui.spoon.Spoon.readAndDispatch (Spoon.java:1347)
at org.eclipse.swt.widgets.Display.readAndDispatch (null:-1)
at org.eclipse.swt.widgets.Display.runDeferredEvents (null:-1)
at org.eclipse.swt.widgets.Widget.sendEvent (null:-1)
at org.eclipse.swt.widgets.EventTable.sendEvent (null:-1)
at org.eclipse.jface.action.ActionContributionItem$5.handleEvent (ActionContributionItem.java:402)
at org.eclipse.jface.action.ActionContributionItem.access$2 (ActionContributionItem.java:490)
at org.eclipse.jface.action.ActionContributionItem.handleWidgetSelection (ActionContributionItem.java:545)
at org.eclipse.jface.action.Action.runWithEvent (Action.java:498)
at org.pentaho.ui.xul.jface.tags.JfaceMenuitem$1.run (JfaceMenuitem.java:106)
at org.pentaho.ui.xul.jface.tags.JfaceMenuitem.access$100 (JfaceMenuitem.java:43)
at org.pentaho.ui.xul.impl.AbstractXulComponent.invoke (AbstractXulComponent.java:141)
at org.pentaho.ui.xul.impl.AbstractXulComponent.invoke (AbstractXulComponent.java:157)
at org.pentaho.ui.xul.impl.AbstractXulDomContainer.invoke (AbstractXulDomContainer.java:313)
at java.lang.reflect.Method.invoke (null:-1)
at sun.reflect.DelegatingMethodAccessorImpl.invoke (null:-1)
at sun.reflect.NativeMethodAccessorImpl.invoke (null:-1)
at sun.reflect.NativeMethodAccessorImpl.invoke0 (null:-2)
at org.pentaho.di.ui.spoon.trans.TransGraph.editStep (TransGraph.java:2129)
at org.pentaho.di.ui.spoon.trans.TransGraph.editStep (TransGraph.java:3072)
at org.pentaho.di.ui.spoon.Spoon.editStep (Spoon.java:8783)
at org.pentaho.di.ui.spoon.delegates.SpoonStepsDelegate.editStep (SpoonStepsDelegate.java:125)
at org.pentaho.big.data.kettle.plugins.hdfs.trans.HadoopFileInputDialog.open (HadoopFileInputDialog.java:575)
at org.eclipse.swt.widgets.Display.readAndDispatch (null:-1)
at org.eclipse.swt.widgets.Display.runDeferredEvents (null:-1)
at org.eclipse.swt.widgets.Widget.sendEvent (null:-1)
at org.eclipse.swt.widgets.EventTable.sendEvent (null:-1)
at org.pentaho.big.data.kettle.plugins.hdfs.trans.HadoopFileInputDialog$3.handleEvent (HadoopFileInputDialog.java:482)
at org.pentaho.big.data.kettle.plugins.hdfs.trans.HadoopFileInputDialog.access$200 (HadoopFileInputDialog.java:125)
at org.pentaho.big.data.kettle.plugins.hdfs.trans.HadoopFileInputDialog.first (HadoopFileInputDialog.java:2634)
at org.pentaho.big.data.kettle.plugins.hdfs.trans.HadoopFileInputDialog.getFirst (HadoopFileInputDialog.java:2722)
at org.pentaho.di.trans.steps.textfileinput.TextFileInput.getLine (TextFileInput.java:97)
at org.pentaho.di.trans.steps.textfileinput.TextFileInput.getLine (TextFileInput.java:127)
at java.io.InputStreamReader.read (null:-1)
at sun.nio.cs.StreamDecoder.read (null:-1)
at sun.nio.cs.StreamDecoder.read0 (null:-1)
at sun.nio.cs.StreamDecoder.read (null:-1)
at sun.nio.cs.StreamDecoder.implRead (null:-1)
at sun.nio.cs.StreamDecoder.readBytes (null:-1)
at org.pentaho.di.core.compress.CompressionInputStream.read (CompressionInputStream.java:68)
at org.apache.commons.vfs2.util.MonitorInputStream.read (MonitorInputStream.java:99)
at java.io.BufferedInputStream.read (null:-1)
at java.io.BufferedInputStream.read1 (null:-1)
at java.io.DataInputStream.read (null:-1)
at org.apache.hadoop.hdfs.DFSInputStream.read (DFSInputStream.java:903)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy (DFSInputStream.java:851)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo (DFSInputStream.java:624)
at org.apache.hadoop.hdfs.BlockReaderFactory.build (BlockReaderFactory.java:374)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp (BlockReaderFactory.java:753)
at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer (BlockReaderFactory.java:838)
at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer (DFSClient.java:3492)
at org.apache.hadoop.net.NetUtils.connect (NetUtils.java:530)
at org.apache.hadoop.net.SocketIOWithTimeout.connect (SocketIOWithTimeout.java:192)
at sun.nio.ch.SocketChannelImpl.connect (null:-1)
at sun.nio.ch.Net.checkAddress (null:-1)
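One hedged observation on the trace above: the failure is in `NetUtils.connect` / `checkAddress` while reading a block, i.e. the client could talk to the NameNode but not to the DataNode. The QuickStart VM registers its DataNode under an internal address (note the `127.0.0.1` embedded in the block pool ID `BP-286282631-127.0.0.1-...`), so an external client may get back an address it cannot reach. A common workaround is to tell the client to connect to DataNodes by hostname, in the `hdfs-site.xml` copied into the shim folder:

```xml
<!-- Add inside <configuration> in the shim's hdfs-site.xml (client side).
     Makes the HDFS client connect to DataNodes by hostname rather than by
     the IP address the NameNode hands back. -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
```

You would then also map the VM's hostname (e.g. `quickstart.cloudera`) to 192.168.109.128 in `C:\Windows\System32\drivers\etc\hosts` so the name resolves from Windows. This is a guess based on the symptoms, not a confirmed fix.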


Hi, has nobody connected Pentaho DI with Hadoop?

Any help would be appreciated.

Thanks,
