Question
Friday, January 24, 2020 6:38 PM
Hi,
We have a Linux VM that is used to run machine learning models. To transform and clean the raw source data for these ML models, we need to trigger some shell scripts on that Linux VM. We already have an ADF pipeline that copies this raw source data into a blob container, which is mounted as storage (e.g. \dev\rawdata) on the Linux VM.
Is there any way in an ADF pipeline to execute a shell script on an Azure Linux VM?
Shafi
All replies (3)
Friday, January 24, 2020 6:55 PM
Hello Shafi,
I cannot say whether it can be triggered through an ADF pipeline, but have you taken a look at the Custom Script Extension for Linux?
/en-us/azure/virtual-machines/extensions/custom-script-linux
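If that route fits, the extension is driven by a small settings document that tells it which files to download to the VM and which command to run. A minimal sketch, assuming a script uploaded to blob storage (the storage URL and script name below are placeholders, not something from this thread):
{
    "fileUris": ["https://mystorageaccount.blob.core.windows.net/scripts/prepare_rawdata.sh"],
    "commandToExecute": "sh prepare_rawdata.sh"
}
The extension downloads the files listed in fileUris onto the VM and then runs commandToExecute, so it could invoke the same shell scripts you run today.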
Monday, January 27, 2020 10:46 AM
Hi Shafi,
You can now directly run commands, scripts, and your own custom code compiled as an executable.
You can execute a command directly using the Custom activity. The following example runs the "echo hello world" command on the target Azure Batch pool nodes and prints the output to stdout.
{
    "name": "MyCustomActivity",
    "properties": {
        "description": "Custom activity sample",
        "activities": [{
            "type": "Custom",
            "name": "MyCustomActivity",
            "linkedServiceName": {
                "referenceName": "AzureBatchLinkedService",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "command": "cmd /c echo hello world"
            }
        }]
    }
}
Note: The Custom activity runs your customized code logic on an Azure Batch pool of virtual machines.
For more details, refer to "Use custom activities in an Azure Data Factory pipeline".
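Since your transformations are shell scripts, the same Custom activity can target a Linux-based Batch pool and invoke bash instead of cmd. A rough sketch of just the activity definition, where the script name, folder path, and linked service names are placeholders you would replace with your own:
{
    "name": "RunPrepScript",
    "type": "Custom",
    "linkedServiceName": {
        "referenceName": "AzureBatchLinkedService",
        "type": "LinkedServiceReference"
    },
    "typeProperties": {
        "command": "bash prepare_rawdata.sh",
        "resourceLinkedService": {
            "referenceName": "MyAzureStorageLinkedService",
            "type": "LinkedServiceReference"
        },
        "folderPath": "customactivity/scripts"
    }
}
Here resourceLinkedService and folderPath point to the storage location from which Batch downloads the script onto the pool nodes before running the command. Keep in mind this runs on the Batch pool, not on your existing Linux VM, so the script would need to reach the mounted blob data from there.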
Alternatively, you could use the Hadoop Hive/Pig script activities in Azure Data Factory.
Hive activity in ADF:
{
    "name": "Hive Activity",
    "description": "description",
    "type": "HDInsightHive",
    "linkedServiceName": {
        "referenceName": "MyHDInsightLinkedService",
        "type": "LinkedServiceReference"
    },
    "typeProperties": {
        "scriptLinkedService": {
            "referenceName": "MyAzureStorageLinkedService",
            "type": "LinkedServiceReference"
        },
        "scriptPath": "MyAzureStorage\\HiveScripts\\MyHiveSript.hql",
        "getDebugInfo": "Failure",
        "arguments": [
            "SampleHadoopJobArgument1"
        ],
        "defines": {
            "param1": "param1Value"
        }
    }
}
Pig activity in ADF:
{
    "name": "Pig Activity",
    "description": "description",
    "type": "HDInsightPig",
    "linkedServiceName": {
        "referenceName": "MyHDInsightLinkedService",
        "type": "LinkedServiceReference"
    },
    "typeProperties": {
        "scriptLinkedService": {
            "referenceName": "MyAzureStorageLinkedService",
            "type": "LinkedServiceReference"
        },
        "scriptPath": "MyAzureStorage\\PigScripts\\MyPigSript.pig",
        "getDebugInfo": "Failure",
        "arguments": [
            "SampleHadoopJobArgument1"
        ],
        "defines": {
            "param1": "param1Value"
        }
    }
}
For more details, refer to the links below:
Transform data using Hadoop Hive activity in ADF
Transform data using Hadoop Pig activity in ADF
Hope this helps.
Monday, February 3, 2020 9:08 AM
Hi there,
Just wanted to check - was the above suggestion helpful to you? If yes, please consider upvoting and/or marking it as the answer, as this would help other community members reading this thread.