I want to create a Map function with the following operations:

Step 1:

I have two data sets, R and S. I want to partition each data set into n equal-sized blocks, which can be done by putting every |R|/n (respectively |S|/n) records into one block.
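The block assignment in Step 1 could be sketched as plain Java (this is not a Hadoop API, just the arithmetic; the method name `blockId` and the 0-based record index are assumptions for illustration):

```java
// Sketch: assigning each record of a data set to one of n
// roughly equal-sized blocks, as described in Step 1.
public class BlockPartition {
    // recordIndex: 0-based position of the record in its data set,
    // totalRecords: size of the data set, n: number of blocks.
    // The first ceil(total/n) records go to block 0, the next
    // ceil(total/n) to block 1, and so on.
    static int blockId(long recordIndex, long totalRecords, int n) {
        long blockSize = (totalRecords + n - 1) / n; // ceiling division
        return (int) (recordIndex / blockSize);
    }

    public static void main(String[] args) {
        // 100 records, 4 blocks of 25 records each
        System.out.println(blockId(0, 100, 4));   // block 0
        System.out.println(blockId(24, 100, 4));  // block 0
        System.out.println(blockId(25, 100, 4));  // block 1
        System.out.println(blockId(99, 100, 4));  // block 3
    }
}
```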

After that:

Step 2: Every possible pair of blocks (one from R and one from S) is then assigned to a bucket at the end of the Map phase, so that the Reduce function can take it as input, with an id as the key for each value pair, e.g.
Java
<id:(Sij,Ril)>
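One common way to form that bucket id, assuming blocks are numbered 0..n-1 in each data set, is to flatten the pair (i, j) into a single integer. This is a sketch of the key derivation only (the name `bucketId` is made up for illustration), not the full Map function:

```java
// Sketch: one bucket (reduce key) per (R-block, S-block) pair,
// giving n*n buckets in total.
public class BucketKeys {
    static int bucketId(int rBlock, int sBlock, int n) {
        return rBlock * n + sBlock;
    }

    public static void main(String[] args) {
        int n = 3;
        // An R record in block i must be emitted n times, once per
        // S block; likewise an S record in block j is emitted once
        // per R block, so each bucket sees both sides of its pair.
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                System.out.println("<" + bucketId(i, j, n)
                        + " : (S" + j + ", R" + i + ")>");
    }
}
```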


So my questions are:

1) Is there any existing function that I can use for step 1? How do I implement this operation separately for each data set?

2) How can I refer specifically to each data set in step 2, so that I can take one block from R and one from S?
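With the old `org.apache.hadoop.mapred` API (which the `JobConf` below suggests is in use), the mapper can read the property `"map.input.file"` in `configure()` to learn which file the current split came from, and tag each record as R or S accordingly. The sketch below tests only the pure tagging helper; the Hadoop wiring is shown in comments, and the prefix-matching convention is an assumption:

```java
// Sketch: deciding whether a record came from R or S by the path
// of the input file that produced it.
public class SourceTag {
    // Assumed convention: args[0] (the R path) is a prefix of every
    // file name produced from R's input directory.
    static boolean isFromR(String inputFile, String rPathPrefix) {
        return inputFile.startsWith(rPathPrefix);
    }

    /*
    // In the mapper, with the old mapred API:
    private boolean fromR;
    public void configure(JobConf job) {
        String inputFile = job.get("map.input.file");
        fromR = isFromR(inputFile, job.get("join.r.path"));
        // "join.r.path" is a made-up property you would set yourself
        // in the driver, e.g. conf.set("join.r.path", args[0]);
    }
    */
}
```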

Note: In main() I define the two data sets like this:
Java
FileInputFormat.setInputPaths(conf, new Path(args[0]), new Path(args[1]));
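As an alternative to `setInputPaths`, the old mapred API also offers `MultipleInputs`, which binds each path to its own Mapper class, so R and S records never have to be disambiguated at runtime. This is a driver-configuration fragment, not a runnable program; `RMapper` and `SMapper` are hypothetical class names:

```java
// Hypothetical mapper classes, one per data set; each can tag its
// output so the reducer knows which side a record came from.
MultipleInputs.addInputPath(conf, new Path(args[0]),
        TextInputFormat.class, RMapper.class);
MultipleInputs.addInputPath(conf, new Path(args[1]),
        TextInputFormat.class, SMapper.class);
```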
Comments
TorstenH. 4-Apr-14 7:37am    
Are you talking about a geographical map, or are you talking about the datatype Map?
User3490 4-Apr-14 7:44am    
Actually, I'm talking about MapReduce: https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html#Example%3A+WordCount+v1.0
I'm trying to implement something like this.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
