Click here to Skip to main content
15,868,019 members
Articles / Programming Languages / Visual Basic

Easy Distribution Computing in VisualBasic

Rate me:
Please Sign up or sign in to vote.
5.00/5 (2 votes)
4 Feb 2016CPOL6 min read 10.8K   5  
Distribution computing framework source code used by http://services.gcmodeller.org online systemsbiology annotation system.

Latest source code on github:

https://github.com/xieguigang/Microsoft.VisualBasic.Parallel.git

App Framework dependency of this library

https://github.com/xieguigang/VisualBasic_AppFramework/tree/master/Microsoft.VisualBasic.Architecture.Framework

Introduction & Background

Recently I was working on two job:

1. The code reconstruct of the online trading system at http://mipaimai.com/ and, 
2. Building the systems biology online annotation system for http://services.gcmodeller.org/ (GCModeller Virtual Cell System).

Both these two project required of a parallel library for the distribution computing framework for processing the large amount of the data.

Image 1

Distribution computing in GCModeller online services

Note:
Each of the http or rest GET/POST request from the browser or PC client, means a task is in queue for invoke the GCModeller data analysis. Due to the reason of most of these computing task is individual to each other, so that a library is required to move these computing task onto the cluster machine in the LAN from the main server to provides a more effective hardware resource utilization and online services quality.

The http server technique of http://services.gcmodeller.org was using my own http server program which was written in VisualBasic and code details was described in previous article: 《Simple HTTP Server in VisualBasic》http://www.codeproject.com/Articles/1068466/Simple-HTTP-Server-in-VisualBasic ,

latest source code of the httpd library can be download from github:

https://github.com/xieguigang/Reference_SharedLib/tree/master/HTTPServer

Using the code

Sometimes the hardware resource on your own local machine probably is not powerful to running a data analysis program, so that allocate the analysis parts to a more powerful machine is necessary; Or the data analysis task process a lots of small parts, then you can using the distribution computing utilities all of your hardware resource in your local network. 

Image 2

Clound computing in movie "The Martian (2015)"

Note:
Generally in order to using the remote computing in System.Runtime.Remoting, There are 6 steps to coding by using System.Runtime.Remoting:
 
1. Design your own remote object: RemoteObject.
2. Register your remote object on server by RemotingConfiguration.RegisterWellKnownServiceType as a WellKnownServiceTypeEntry instance object
3. Register your server by TcpServerChannel into ChannelService
4. Register your client TcpClientChannel into ChannelService
5. Gets remote object by Activator from a URL
6. Using the remote object in your client

Probably you would like to write you program like this way, coding the data analysis program just like a local client code, and reused your previous code running in a distribution computing cluster without any modifications: 

VB.NET
Module API

    ' This is the code that i've written in the previous project, but running in a local program
    ' and now i want to running this code in the cloud but the function its caller is running on 
    ' a local machine

    Public Function Anlysis(file As FileStream) As dataOut()

        ' Long time data analysis for the large dataset probably size up to 1TB or more
        ' blablabla

    End Function
End Module 

but as mention at previous, if you needs to running this code on the cloud by using System.Runtime.Remoting, a lot of your client code and even the previous analysis code project will be modified too.

For directly integrated the server cluster hardware resource in my program, a remote computing library was developed. By using Microsoft.VisualBasic.ComputingServices instead of System.Runtime.Remoting, no needs for modified your previous code anymore. Now I can calling this analysis function on the remote machine in this way in my local machine application:

VB.NET
Dim in As FileStream = New FileStream(DbFile)
' Gets the function pointer from your local machine
Dim Analysis As Func(Of FileStream, dataOut()) = AddressOf API.Analysis
' Gets linq source
Dim Result As ILinq(Of dataOut) = remoteMachine.AsLinq(Of dataOut)(Analysis, {in})
' Or just invoke function
Dim Result As dataOut() = remoteMachine.Invoke(Of dataOut())(Analysis, {in})
Dim downstreamAnalysis = (From x in Result Where (blablabla) Select Project(x))

' blablabla

Just throw the function pointer to the remote machine and then execute it!

How to using this computing framework? summary into 5 steps:

Note:

1. write your data processing program just as usual
2. If a data process step can be decomposing into parts for parallel, then just gets the analysis function delegate pointer, and running the function on the remote server
3. If a data process step can not be processing on local machine, then just gets the analysis function delegate pointer, and then running the function on a more powerful machine.
4. Compile your program and then published (or upload) to the server(Due to the cluster is in a local network, so that a simple powershell script is enough for this upload job to updates all of your machine)
5. Run your server and start analysis job

There are 2 parts of the code in this article to introduce the simply enough distribution computing in VisualBasic:

1. Remote Function Pointer Invoke
2. Remote Linq

First of All, imports the computing services library 

Microsoft.VisualBasic.ComputingServices
Microsoft.VisualBasic.ComputingServices.TaskHost

How to invoke a function in VisualBasic?

Reflection parts of the invokes can be reviewed at Shoal language article: http://www.codeproject.com/Articles/820854/Powerful-ShellScript-for-bioinformatics-researcher

 

Generally the steps of the reflection way to invoke a method in VisualBasic can be summary as:

  1. Get function pointer by using AddressOf operator or lambda expression
  2. Gets the MethodInfo from the delegate
  3. Using MethodInfo.Invoke to invoke the target function.
VB.NET
Public Function CreateObject(func As [Delegate], args As Object()) As InvokeInfo
    Dim type As Type = func.Method.DeclaringType
    Dim assm As Assembly = type.Assembly
    Dim name As String = func.Method.Name
    Dim callsType As Type() = func.Method.GetParameters.ToArray(Function(x) x.ParameterType)
    Dim params As String() = args.ToArray(
        Function(x, idx) Serialization.JsonContract.GetJson(x, callsType(idx)))

    Return New InvokeInfo With {
        .Assembly = FileIO.FileSystem.GetFileInfo(assm.Location).Name,
        .Name = name,
        .Parameters = params,
        .Type = type.FullName
    }
End Function

A tool function code for how to parsing these Reflection information from a Delegate function pointer.

 

For knowing a function location you needs 3 basically requirements:

  1. TypeInfo (System.Reflection.Assembly and System.Type)
  2. Method Name (MethodInfo.Name)
  3. Method Parameters (MethodInfo.GetParameters)

And if the target method have several overloads edition, then the parameter is required for get the corrects edition of your method.

 

Remote Function

Image 3

Schema diagram of the Microsoft.VisualBasic.ComputingServices Parallel library.

Serialization of the Function Pointer

As described how to invoke a function in visualBasic, a function pointer should be serialized in to json string first,  then the program transfer the pointer to the remote server possible. So based on the information in a delegate object can provides, a InvokeInfo object was defined for the remote invocation. 

VB.NET
''' <summary>
''' 分布式计算框架之中的远程调用的参数信息
''' </summary>
Public Class InvokeInfo

    ''' <summary>
    ''' 模块文件
    ''' </summary>
    ''' <returns></returns>
    Public Property Assembly As String
    ''' <summary>
    '''    ''' </summary>
    ''' <returns></returns>
    Public Property Type As String
    ''' <summary>
    ''' 函数名
    ''' </summary>
    ''' <returns></returns>
    Public Property Name As String
    ''' <summary>
    ''' json value.(函数参数)
    ''' </summary>
    ''' <returns></returns>
    Public Property Parameters As String()
End Class

Invoke on the remote machine

When we gets the information on the server side program, we must restored the delegate function pointer information to the object that Reflection operation can recognized:

VB.NET
''' <summary>
''' A common function of invoke the method on the remote machine
''' </summary>
''' <param name="params">远程主机上面的函数指针</param>
''' <param name="value">value's <see cref="system.type"/></param>
''' <returns></returns>
Private Shared Function __invoke(params As InvokeInfo, ByRef value As Type) As Object
    Dim func As MethodInfo = params.GetMethod
    Dim paramsValue As Object() = InvokeInfo.GetParameters(func, params.Parameters)
    Dim x As Object = func.Invoke(Nothing, paramsValue)

    value = func.ReturnType

    Return x
End Function

First, based on the assembly name we load assembly file which we published(or uploaded) on the server

VB.NET
Public Function LoadAssembly() As Assembly
     Dim path As String = App.HOME & "/" & Assembly
     Dim assm As Assembly = System.Reflection.Assembly.LoadFile(path)
     Return assm
End Function

Then, we can get the declaring type of the function pointer

VB.NET
Public Overloads Function [GetType]() As Type
     Dim assm As Assembly = LoadAssembly()
     Dim type As Type = assm.GetType(Me.Type)
     Return type
End Function 

And at last get the function target for the invoke

VB.NET
Public Function GetMethod() As MethodInfo
     Dim type As Type = [GetType]()
     Dim func As MethodInfo = type.GetMethod(Name, BindingFlags.Public Or BindingFlags.Static)
     Return func
End Function

Finally we can gets the parameter value for the function by using json deserialization

VB.NET
''' <summary>
''' 
''' </summary>
''' <param name="method"></param>
''' <param name="args">json</param>
''' <returns></returns>
Public Shared Function GetParameters(method As MethodInfo, args As String()) As Object()
     Dim params As Type() = method.GetParameters.ToArray(Function(x) x.ParameterType)
     Dim values As Object() = args.ToArray(Function(x, idx) Serialization.LoadObject(x, params(idx)))
     Return values
End Function

Remote Invoke Protocol

VB.NET
''' <summary>
''' 本地服务器通过这个方法调用远程主机上面的函数
''' </summary>
''' <param name="target"></param>
''' <param name="args"></param>
''' <returns></returns>
Public Function Invoke(target As [Delegate], ParamArray args As Object()) As Object
     Dim params As InvokeInfo = InvokeInfo.CreateObject(target, args)
     Dim value As String = Serialization.GetJson(params)
     Dim req As RequestStream = New RequestStream(ProtocolEntry, TaskProtocols.Invoke, value)
     Dim rep As RequestStream = New AsynInvoke(_remote).SendMessage(req)
     Dim rtvl As Rtvl = Serialization.LoadObject(Of Rtvl)(rep.GetUTF8String)
     Dim obj As Object = rtvl.GetValue(target)
     Return obj
End Function

Image 4

Net data transfer by protocol handler in the namespace Microsoft.VisualBasic.Net.Protocol

VB.NET
<Protocol(TaskProtocols.Invoke)>
Private Function Invoke(CA As Long, args As RequestStream, remote As System.Net.IPEndPoint) As RequestStream
    Dim params As InvokeInfo = Serialization.LoadObject(Of InvokeInfo)(args.GetUTF8String)
    Dim value As Rtvl = Invoke(params)

    Return New RequestStream(value.GetJson)
End Function

''' <summary>
''' Invoke the function on the remote server.(远程服务器上面通过这个方法执行函数调用)
''' </summary>
''' <param name="params"></param>
''' <returns></returns>
Public Shared Function Invoke(params As InvokeInfo) As Rtvl
    Dim rtvl As Rtvl

    Try
        Dim rtvlType As Type = Nothing
        Dim value As Object = __invoke(params, rtvlType)

        rtvl = New Rtvl(value, rtvlType)

    Catch ex As Exception
        ex = New Exception(params.GetJson, ex)
        rtvl = New Rtvl(ex)

    End Try

    Return rtvl
End Function

Remote Linq

The linq expression is much useful in the VisualBasic programming.

Image 5

Remote source

Generic iterator

For implements a LINQ source an iterator is required:

VB.NET
Imports System.Threading
Imports System.Threading.Thread

Namespace ComponentModel.DataSourceModel

    ''' <summary>
    ''' Implements for the IEnumerable(Of T), Supports a simple iteration over a non-generic collection.
    ''' </summary>
    Public Class Iterator : Implements IEnumerator
        Implements IDisposable

        ReadOnly _source As IEnumerable

        Sub New(source As IEnumerable)
            _source = source
            Reset()
        End Sub

        ''' <summary>
        ''' Gets the current element in the collection.
        ''' </summary>
        ''' <returns></returns>
        Public ReadOnly Property Current As Object Implements IEnumerator.Current
        Public ReadOnly Property ReadDone As Boolean

        Dim receiveDone As New ManualResetEvent(False)

        Private Sub __moveNext()
            _ReadDone = False

            ' Single thread safely
            For Each x As Object In _source ' 单线程安全
                _Current = x

                Call receiveDone.WaitOne()
                Call receiveDone.Reset()
            Next

            _ReadDone = True
        End Sub

        ''' <summary>
        ''' Returns current and then automatically move to next position
        ''' </summary>
        ''' <returns></returns>
        Public Function Read() As Object
            Dim x As Object = Current
            Call MoveNext()
            Return x
        End Function

        Dim _forEach As Thread

        ''' <summary>
        ''' Sets the enumerator to its initial position, which is before the first element in the collection.
        ''' </summary>
        Public Sub Reset() Implements IEnumerator.Reset
            If Not _forEach Is Nothing Then  ' 终止这条线程然后再新建
                Call _forEach.Abort()
            End If

            _forEach = New Thread(AddressOf __moveNext)
            _forEach.Start()
        End Sub

        ''' <summary>
        ''' Advances the enumerator to the next element of the collection.
        ''' </summary>
        ''' <returns>
        ''' true if the enumerator was successfully advanced to the next element; false if the enumerator has passed the end of the collection.
        ''' </returns>
        Public Function MoveNext() As Boolean Implements IEnumerator.MoveNext
            Call receiveDone.Set()
            Return Not ReadDone
        End Function
   End Class
End Namespace

Due to the reason of the remote linq is usually using for processing the large size data set, so that one element execution may takes time, the single thread move next function in this iterator using the for each statement to gets the element in the remote function 

Linq source provider works as a remote iterator

Image 6

Remote Linq Provider

The LinqProvider object work as a remote iterator in this framework, there are two important parts in this class object for implements the IEnumerator interface:

A move next function for gets the current value in the iterator and then move to next location automatically, a Reset function to resets the iterator state in the remote

VB.NET
<Protocol(TaskProtocols.MoveNext)>
Private Function __moveNext(CA As Long, args As RequestStream, remote As System.Net.IPEndPoint) As RequestStream
     Dim value As Object = _source.Current
     Dim readEnds As Boolean = _source.MoveNext()
     Dim json As String = Serialization.GetJson(value, _type)
     Dim flag As Long = If(Not readEnds, Protocols.TaskProtocols.ReadsDone, HTTP_RFC.RFC_OK)
     Return New RequestStream(flag, flag, json)
End Function

<Protocol(TaskProtocols.Reset)>
Private Function __reset(CA As Long, args As RequestStream, remote As System.Net.IPEndPoint) As RequestStream
     Call _source.Reset()
     Return NetResponse.RFC_OK
End Function

Linq local reader

VB.NET
''' <summary>
''' Remote LINQ source reader
''' </summary>
''' <typeparam name="T"></typeparam>
Public Class ILinq(Of T) : Implements IEnumerable(Of T)
    Implements IDisposable

    ''' <summary>
    ''' Element type in the source collection.
    ''' </summary>
    ''' <returns></returns>
    Public ReadOnly Property Type As Type = GetType(T)
    ''' <summary>
    ''' Remote entry point
    ''' </summary>
    ''' <returns></returns>
    Public ReadOnly Property Portal As IPEndPoint

    ReadOnly invoke As AsynInvoke
    ReadOnly req As New RequestStream(Protocols.ProtocolEntry, TaskProtocols.MoveNext)

    ''' <summary>
    ''' Creates a linq source reader from the remote entry point
    ''' </summary>
    ''' <param name="portal"></param>
    Sub New(portal As IPEndPoint)
        Me.Portal = portal
        Me.invoke = New AsynInvoke(portal)
    End Sub

    Public Overrides Function ToString() As String
        Return $"{Type.FullName}@{Portal.ToString}"
    End Function

#Region "Implements IEnumerable(Of T)"
     Public Iterator Function AsQuerable() As IEnumerator(Of T) Implements IEnumerable(Of T).GetEnumerator
        Call invoke.SendMessage(Protocols.LinqReset)  ' resets the remote linq source read position

        Do While True
            Dim rep As RequestStream = invoke.SendMessage(req)
            Dim json As String = rep.GetUTF8String
            Dim value As Object = Serialization.LoadObject(json, Type)
            Dim x As T = DirectCast(value, T)

            If rep.ProtocolCategory = TaskProtocols.ReadsDone Then
                Exit Do
            Else
                Yield x
            End If
        Loop
    End Function

    Private Iterator Function IEnumerable_GetEnumerator() As IEnumerator Implements IEnumerable.GetEnumerator
        Yield AsQuerable()
    End Function

#End Region

    ''' <summary>
    ''' Automatically free the remote resource.(释放远程主机上面的资源)
    ''' </summary>
    Private Sub __free()
        Dim uid As String = Portal.ToString
        Dim req As New RequestStream(ProtocolEntry, TaskProtocols.Free, uid)
        Call invoke.SendMessage(req)
    End Sub

Additional: Network data transfer in VisualBasic

Object Json Serialization

Object Serialize as Json string

VB.NET
Imports Microsoft.VisualBasic.Serialization

Dim json As String = <Your Object>.GetJson

Deserialize the json string as .NET object

VB.NET
Dim params As InvokeInfo = 
    Microsoft.VisualBasic.Serialization.LoadObject(Of InvokeInfo)(args.GetUTF8String)

An alternative solution of the Json serialization in the .NET programming is NewtonSoft Json library.

SendMessage

VB.NET
' Remote machine network portal
Dim remote As IPEndPoint

' Protocol request
Dim req As RequestStream = New RequestStream(ProtocolEntry, TaskProtocols.Invoke, ProtocolParameter)

' Send Protocol to invoke on the remote machine
Dim rep As RequestStream = New AsynInvoke(remote).SendMessage(req)

' Gets the remote returns json value
Dim rtvl As Rtvl = Serialization.LoadObject(Of Rtvl)(rep.GetUTF8String)

Protocol Handler

On the remote server machine

VB.NET
Imports Microsoft.VisualBasic.Net
Imports Microsoft.VisualBasic.Net.Protocol
Imports Microsoft.VisualBasic.Net.Protocol.Reflection

Dim Responsehandler = AddressOf New ProtocolHandler(Me).HandleRequest

The Microsoft.VisualBasic.Net.Protocol.Reflection.Protocol attribute is required of building a protocols flexible and high performance multiple threading server side program.

Services Socket

VB.NET
Dim Server As New TcpSynchronizationServicesSocket(listenPort)

' Binding the protocol to your server socket
Server.Responsehandler = AddressOf New ProtocolHandler(Me).HandleRequest

' Then the current thread will be bloked at Server.Run statement
' until the socket is disposed at other place
Call Server.Run

' Or using the thread to run the socket
Call Microsoft.VisualBasic.Parallel.Run(AddressOf Server.Run)

Points of Interest

Build Your Own Cloud Computing platform

Image 7

Base on the Parallel library in this article, then you can try build your own cloud computing platform like Microsoft Azure, you can programming to fully access the remote server resource just like running on the local machine.

One example of the mobile cloud computing is the Microsoft bioinformatics scientific project .NET Bio(Previous used name on codeplex is Microsoft Bioinformatics Foundation, MBF), a mobile library was developed that the developer can integrated their WindowsPhone/Android/IOS App with the NCBI cloud computing BLAST services to analysis the biological sequence data size up to 10GB or 100GB.

By using the cloud computing, this can integrated your server machine resource into the Windows Phone or UWP program directly, then this makes using your Windows Phone processing very large data set possible.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Technical Lead PANOMIX
China China
He is good and loves VisualBasic! Senior data scientist at PANOMIX


github: https://github.com/xieguigang

Comments and Discussions

 
-- There are no messages in this forum --