Function Extensions Techniques 2: Tuples

JIANGWilliam


9 May 2018, CPOL, 15 min read


This is the second of three episodes to introduce considerations and techniques applied in Tuple classes in my open-sourced library functionExtensions.

functionExtensions is a Java library with Throwable Functional Interfaces, Tuples and Repositories implemented to expedite Functional Programming with Java 8. It is released on Maven, with the following goals accomplished:

Declares a rich set of functional interfaces that throw Exceptions. They can be converted to conventional ones, with checked Exceptions handled by shared exceptionHandlers, so developers can define only the relevant business logic within the lambda expressions.
Implements an immutable data structure, as a set of Tuple classes, to keep and retrieve up to the first 20 strong-typed values.
Provides Repositories as a kind of Map-based intelligent utility with pre-defined business logic to evaluate a given key or keys (up to 7 strong-typed values as a single Tuple key) to get the corresponding value or values (up to 7 strong-typed values as a single Tuple value), then buffer and return them if no Exception happened.
Provides multiple powerful generic utilities to support the above 3 types of utilities, mainly built with Repositories, that support various combinations of primitive and object types and arrays. For example:
- Object getNewArray(Class clazz, int length): creates a new array of ANY type from the given element type and length.
- String deepToString(Object obj): returns a string representation of the "deep contents" of the specified array.
- Object copyOfRange(Object array, int from, int to): copies all or part of the array as a new array of the same type, no matter if the array is composed of primitive values or not.
- T convert(Object obj, Class<T> toClass): converts the object to any equivalent or assignable type.
- boolean valueEquals(Object obj1, Object obj2): compares any two objects. If both are arrays, compares them by treating primitive values as equal to their wrappers, and null and empty array elements with predefined default strategies.


Introduction

When working with an Object Oriented Programming language like Java or C# and dealing with a set of correlated data, the common practice is to define classes as containers of various strong-typed fields plus related methods. Although it is possible to declare anonymous classes in Java 8, sometimes, especially when different combinations of various types of data are concerned, using classes/structs everywhere looks like overkill to me.

In this post, I would like to show how easy it is to implement generic Tuple classes that encapsulate correlated data together, as well as their potential use in applying the Functional Programming paradigm to simplify the coding process and enable more advanced features to be discussed in the following posts.

Background

The idea and basic implementation of my Tuple classes in Java come from the Tuple class in C#. As the Microsoft post explains, Tuples are commonly used in four ways:

To represent a single set of data. For example, a tuple can represent a database record, and its components can represent individual fields of the record.
To provide easy access to, and manipulation of, a data set.
To return multiple values from a method without using out parameters (in C#) or ByRef parameters (in Visual Basic).
To pass multiple values to a method through a single parameter. For example, the Thread.Start(Object) method has a single parameter that lets you supply one value to the method that the thread executes at startup time. If you supply a Tuple<T1, T2, T3> object as the method argument, you can supply the thread’s startup routine with three items of data.

A good summary of C# Tuple is here.

Although Tuple is not widely used in .NET, which based on my observations is due to a lack of killer apps, there are at least three attractive benefits to be exploited:

The immutable nature, which is quite critical for Functional Programming.
A compact structure to carry a dynamic dataset of strong-typed values, which in some cases could be used to replace POJOs.
A bundle of values of varied length can be treated as an object of the same type (Tuple).

Consequently, this simple generic class is used to work out the generic Repository<> classes and Railway Oriented Programming utilities.

Implementations

Under the hood, the Tuple classes use final Object[] values to keep the elements composing this data structure, which means primitive values (int, float...) are boxed to their corresponding wrappers (Integer, Float...).

The strong-typed read-only accessors to these elements are actually defined as default methods in a number of WithValuesXx interfaces. The comparability, which is critical when using Tuples as the keys of a map, is backed by a set of powerful generic methods.

Immutability

Once a Tuple instance is created, there is no means to change its elements by adding or deleting, since the array of Object is final. Though nothing prevents the content of these elements from changing (for example, adding elements to or deleting them from a List that is one of the elements of the Tuple), that is a misuse of this data structure and out of the scope of consideration.

With the assumption that no element changes once the Tuple is constructed, the following methods are backed by private variables to avoid being evaluated repeatedly:

@Override int hashCode();
@Override String toString();
int[][] getDeepLength(): used to enable boolean equals(Object obj), to be discussed in following section.
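
The caching described above can be sketched in isolation. The class below is a hypothetical stand-in, not the library's Tuple: the name CachedValues and its fields are assumptions made for illustration, and it relies only on java.util.Arrays to compute the values that are then kept in private variables.

```java
import java.util.Arrays;

// Hypothetical stand-in for an immutable value holder; not the library's Tuple class.
final class CachedValues {
    private final Object[] values;
    private Integer cachedHashCode; // null until hashCode() is first called
    private String cachedString;    // null until toString() is first called

    CachedValues(Object... values) {
        // Defensive copy so the caller cannot swap elements afterwards
        this.values = values.clone();
    }

    @Override
    public int hashCode() {
        if (cachedHashCode == null) {
            cachedHashCode = Arrays.deepHashCode(values);
        }
        return cachedHashCode;
    }

    @Override
    public String toString() {
        if (cachedString == null) {
            cachedString = Arrays.deepToString(values);
        }
        return cachedString;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof CachedValues
                && Arrays.deepEquals(values, ((CachedValues) other).values);
    }
}
```

Because the elements are assumed never to change, a race between two threads computing the same cached value is harmless here: both would store the same result.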

Strong-typed Accessors

Once multiple values of different types are used to construct a Tuple instance, retrieving them as their original types is a challenge. Instead of using a single Object[] as this library does, an alternative approach is to define multiple private final fields, one per element type. For instance, a Tuple3<T,U,V> could be defined as:

public class Tuple3<T,U,V> extends Tuple {
    private final T t; //public T getFirst(){ return t; }
    private final U u; //public U getSecond(){ return u; }
    private final V v; //public V getThird(){ return v; }
}

Such an implementation, however, makes extension and maintenance difficult and inefficient. A Tuple class composed of N elements of different types needs N variables to keep them and N methods to access them, even though these variables and methods could be shared with Tuple classes composed of N+1 elements.

Though it is possible to share variables and methods through class inheritance, a different technique is applied in this library, with key points as below:

Using Java generic interfaces to retain the type info of the elements composing a Tuple.
Default methods of interfaces, available since Java 8 and comparable to extension methods in C#, are used to define the strong-typed element accessors with the type info retained by the generic interfaces.
Instead of class inheritance, a chain of generic interfaces is defined, each extending the previous one:
- To inherit the default methods from the super interfaces. For instance, WithValues3<T,U,V> inherits the T getFirst() and U getSecond() methods of WithValues2<T,U>.
- To define their own accessors that can be inherited by their sub-interfaces. For example, V getThird() is defined by WithValues3<T,U,V>.
The TupleN class with N strong-typed elements, as a subclass of Tuple that implements the WithValuesN interface, only needs to declare its constructors, since the N element accessors are inherited directly from the WithValuesN interface.

The relevant generic interfaces are declared as below:

public interface WithValues {
    Object getValueAt(int index);
}
public interface WithValues1<T> extends WithValues {
    default T getFirst() {
        return (T)getValueAt(0);
    }
}
public interface WithValues2<T,U> extends WithValues1<T> {
    default U getSecond() {
        return (U)getValueAt(1);
    }
} ...
public interface WithValues20<T,U,V,W,X,Y,Z,A,B,C,D,E,F,G,H,I,J,K,L,M> extends WithValues19<T,U,V,W,X,Y,Z,A,B,C,D,E,F,G,H,I,J,K,L> {
    default M getTwentieth() {
        return (M)getValueAt(19);
    }
}
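
To see the interface chain working end to end, here is a minimal self-contained sketch limited to three elements. The names HasValues, HasValues1..3, Values and Triple are made up for this illustration so they are not confused with the library's own WithValuesN and TupleN types; only the technique (default methods on a chain of generic interfaces over a single Object[]) is taken from the text above.

```java
// Hypothetical names; a reduced model of the chained-interface technique.
interface HasValues {
    Object getValueAt(int index);
}

interface HasValues1<T> extends HasValues {
    @SuppressWarnings("unchecked")
    default T getFirst() { return (T) getValueAt(0); }
}

interface HasValues2<T, U> extends HasValues1<T> {
    @SuppressWarnings("unchecked")
    default U getSecond() { return (U) getValueAt(1); }
}

interface HasValues3<T, U, V> extends HasValues2<T, U> {
    @SuppressWarnings("unchecked")
    default V getThird() { return (V) getValueAt(2); }
}

// Base class owning the single Object[] that stores every element
class Values implements HasValues {
    private final Object[] values;

    Values(Object... values) { this.values = values.clone(); }

    @Override
    public Object getValueAt(int index) {
        return (index < 0 || index >= values.length) ? null : values[index];
    }
}

// The concrete class only declares a constructor; all accessors are inherited
final class Triple<T, U, V> extends Values implements HasValues3<T, U, V> {
    Triple(T t, U u, V v) { super(t, u, v); }
}
```

With these definitions, new Triple<>(1, "two", 3.0) exposes getFirst(), getSecond() and getThird() typed as Integer, String and Double, although Triple itself declares none of them.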

The 20 element accessors (T getFirst(), ..., M getTwentieth()) are each defined only once, but shared by all inheriting interfaces that have more elements than the one declaring them. The only abstract method, Object getValueAt(int index), is implemented by Tuple, the super class of all other classes in this module, as below:

public class Tuple implements AutoCloseable, Comparable<Tuple>, WithValues {
    @Override
    public Object getValueAt(int index) {
        // Out-of-range indexes yield null instead of throwing
        if(index < 0 || index >= values.length)
            return null;
        return values[index];
    }
    // ... other members omitted
}

Its subclasses share a very succinct structure, as Tuple3<T,U,V> shows:

public class Tuple3<T,U,V> extends Tuple
    implements WithValues3<T,U,V> {

    protected Tuple3(T t, U u, V v){
        super(t, u, v);
    }
}

Even the TuplePlus<T,U,V...L,M> class, which is composed of more than 20 elements as defined below, can still access its first 20 elements with the shared strong-typed accessors:

public class TuplePlus<T,U,V,W,X,Y,Z,A,B,C,D,E,F,G,H,I,J,K,L,M> extends Tuple
        implements WithValues20<T,U,V,W,X,Y,Z,A,B,C,D,E,F,G,H,I,J,K,L,M> {
    protected TuplePlus(T t, U u, V v, W w, X x, Y y, Z z, A a, B b, C c, D d, E e, F f, G g, H h, I i, J j, K k, L l, M m, Object... more){
        super(ArrayHelper.mergeTypedArray(new Object[]{t,u,v,w,x,y,z,a,b,c,d,e,f,g,h,i,j,k,l,m}, more));
    }
}

In this way, all elements of a Tuple are stored in a single Object array and can be retrieved as they are with the shared accessors defined only once in the generic interfaces.

Even though there is a cast from Object to the actual element type, the benefits outweigh its cost:

No duplicated code to access any element.
Easy to maintain and extend. (That is why I renamed the Tuples from Single/Double to Tuple1/Tuple2.)
Easy to test: a full test of TuplePlus means all other TupleN classes have been validated.

However, there can be an issue when running "mvn javadoc:javadoc" before publishing the library to Maven: it took as much as 5.6 GB of memory! I have reported this as a bug to Oracle, but no performance hit was observed with this implementation when compiling or running the tests.

Comparability

Beyond being a data structure that keeps multiple values, the Tuple classes are also designed as Comparable<Tuple> objects that can be compared with another set of values, regardless of whether a value is primitive or an object, so that they can be used reliably as keys of a map.

To achieve this, several generic methods were developed in this library to enable advanced comparison of any objects or arrays. Though they can be used in more generic scenarios, they are discussed here rather than in the next episode because they are indispensable features of the Tuple:

(in TypeHelper) int deepHashCode(Object obj) gets the hash code of an object, where obj is:
- null or a single object: returns the same hash code as int Objects.hashCode(Object o);
- an array of objects: returns the same hash code as int Arrays.deepHashCode(Object a[]);
- a single primitive value: it is boxed to its wrapper object before evaluation;
- an array of primitive values: it is converted to the corresponding wrapper array before being evaluated. For instance, int[] is converted to Integer[] and char[][] to Character[][] before TypeHelper.deepHashCode() is called with the converted array. The conversion is backed by another generic utility, Object toEquivalent(Object obj) in TypeHelper, to be discussed in the third episode.
(in TypeHelper) String deepToString(Object obj) gets the string representation of the given object, whether it is a single value or an array of primitive values or objects. This method is used to evaluate String toString() of a Tuple the first time it is called, and the result is saved for later use under the immutability assumption. The int compareTo(Tuple o) of the Comparable<Tuple> interface is also based on the String representation of the Tuples.
(in TypeHelper) int[][] getDeepLength(Object obj) takes a snapshot of an object's structure by treating it as an array of arrays, which enables valueEquals(), even in parallel:
- 3 types of nodes are classified for later evaluation: NORMAL_VALUE_NODE (0), NULL_NODE (-1) or EMPTY_ARRAY_NODE (-2).
- The indexes that locate a node of one of these 3 kinds, followed by its type identifier, are composed as an int[] that captures how to reach the node and what kind of node it is.
- All captured int[] are grouped as the resulting int[][].
- For example, Object target = new Object[]{1, new int[]{2,3}, new Object[]{null, '5', '6', null}, new char[0], 110} would get a deepLength of [[0, 0], [1, 0, 0], [1, 1, 0], [2, 0, -1], [2, 1, 0], [2, 2, 0], [2, 3, -1], [3, -2], [4, 0]]:
  - The 3rd int[] "[1, 1, 0]" describes "3", the second element of the int[]{2,3} that is the second element of the target; the trailing 0 means it is a NORMAL_VALUE_NODE.
  - The 8th int[] "[3, -2]" denotes the new char[0] as the 4th element of the target, and marks it as an empty array.
(in TypeHelper) boolean valueEquals(Object obj1, Object obj2), and other versions with optional strategies or running mode (parallel or serial), evaluate if two objects are equal:
- If either of them is not an array, returns the result of boolean Objects.equals(Object a, Object b);
- Otherwise behaves similarly to boolean Arrays.equals(Object[] a1, Object[] a2), but:
  - Arrays of primitive values are treated the same as their wrapper arrays: char == Character, int[] == Integer[] and boolean[][] == Boolean[][].
  - The evaluation starts by comparing the deepLength of the compared Tuple instances, which is buffered and shows the structure of the elements composing the Tuple. Only when the two int[][] have the same length and match exactly does the method evaluate every node by its actual value with the given or default evaluation strategies.
  - Since each node can be accessed with its own indexes, the evaluation can happen in parallel.
  - Null nodes and empty array nodes can each be evaluated with 3 strategies: TypeIgnored, BetweenAssignableTypes and SameTypeOnly. Thus a new int[0] equals a new Integer[0] under TypeIgnored or BetweenAssignableTypes, but not under SameTypeOnly.
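
The deepLength snapshot described above can be reproduced with a short recursive walk. This is only a sketch written from the description and the example in this article, not the library's TypeHelper code; the class name DeepLengthSketch and its helper methods are assumptions, and how the real method treats a non-array argument is not stated here.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: rebuilds the int[][] layout described in the article.
final class DeepLengthSketch {
    static final int NORMAL_VALUE_NODE = 0;
    static final int NULL_NODE = -1;
    static final int EMPTY_ARRAY_NODE = -2;

    static int[][] getDeepLength(Object obj) {
        List<int[]> nodes = new ArrayList<>();
        collect(obj, new int[0], nodes);
        return nodes.toArray(new int[0][]);
    }

    // Depth-first walk; 'path' holds the indexes leading to the current node
    private static void collect(Object obj, int[] path, List<int[]> nodes) {
        if (obj == null) {
            nodes.add(append(path, NULL_NODE));
        } else if (!obj.getClass().isArray()) {
            nodes.add(append(path, NORMAL_VALUE_NODE));
        } else if (Array.getLength(obj) == 0) {
            nodes.add(append(path, EMPTY_ARRAY_NODE));
        } else {
            int length = Array.getLength(obj);
            for (int i = 0; i < length; i++) {
                // Array.get boxes primitive elements, so int[] and Integer[] walk alike
                collect(Array.get(obj, i), append(path, i), nodes);
            }
        }
    }

    private static int[] append(int[] path, int last) {
        int[] result = Arrays.copyOf(path, path.length + 1);
        result[path.length] = last;
        return result;
    }
}
```

Applied to the example target above, Arrays.deepToString(DeepLengthSketch.getDeepLength(target)) gives [[0, 0], [1, 0, 0], [1, 1, 0], [2, 0, -1], [2, 1, 0], [2, 2, 0], [2, 3, -1], [3, -2], [4, 0]]. Because every node carries its own index path, two snapshots can be compared first, and the nodes then visited independently.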

Treating primitive values the same as their wrapper objects is what differentiates this comparison of two arrays from boolean Arrays.equals(Object[] a1, Object[] a2), which cannot conclude that new int[]{1,2,3} equals new Integer[]{1,2,3}.

To be more specific, when comparing the following two objects:

obj1: new Object[]{new int[]{3,2,1}, new short[0], 1.1d, null, new String[]{"S1", "S2", null}, new Integer[]{null, 23}}
obj2: new Object[]{new Integer[]{3,2,1}, new Short[0], Double.valueOf(1.1), null, new Comparable[]{"S1", "S2", null}, new Number[]{null, 23}}

Then the evaluation results could be:

By default, when NullEquality and EmptyArrayEquality are both TypeIgnored, the nulls of obj1 (4th element of the Object[], 3rd element of the String[] and 1st of the Integer[]) are treated as equal to the nulls of obj2 (4th element of the Object[], 3rd element of the Comparable[] and 1st of the Number[]), and the only empty array of obj1 (new short[0]) is also regarded as equal to the only empty array of obj2 (new Short[0]). Thus valueEquals(obj1, obj2) returns true.
If NullEquality.SameTypeOnly is chosen instead, the nulls of obj1 (4th element of the Object[], 3rd element of the String[] and 1st of the Integer[]) are NOT treated as equal to the nulls of obj2 (4th element of the Object[], 3rd element of the Comparable[] and 1st of the Number[]), thus calling valueEquals(obj1, obj2, SameTypeOnly, TypeIgnored) would return false.
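
As a rough illustration of the default behaviour only, the sketch below compares two objects reflectively so that primitive arrays and wrapper arrays are treated alike. It is an assumption-laden simplification: the class name ValueEqualsSketch is made up, it always behaves like the TypeIgnored strategy for nulls and empty arrays, and it runs serially without the deepLength pre-check that the library is described as using.

```java
import java.lang.reflect.Array;
import java.util.Objects;

// Sketch only: serial comparison with "type ignored" handling of nulls and empty arrays.
final class ValueEqualsSketch {
    static boolean valueEquals(Object a, Object b) {
        // If either side is null or not an array, fall back to Objects.equals
        if (a == null || b == null
                || !a.getClass().isArray() || !b.getClass().isArray()) {
            return Objects.equals(a, b);
        }
        int length = Array.getLength(a);
        if (length != Array.getLength(b)) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            // Array.get boxes primitives, so an int element compares as an Integer
            if (!valueEquals(Array.get(a, i), Array.get(b, i))) {
                return false;
            }
        }
        return true;
    }
}
```

For the obj1 and obj2 shown above, this sketch returns true, whereas Arrays.deepEquals(obj1, obj2) returns false because it never treats an int[] as equal to an Integer[].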

AutoCloseable

The Tuple class is declared as AutoCloseable with the following close() method:

@Override
public void close() throws Exception {
    if(!closed) {
        //Close AutoCloseable object in reverse order
        for (int i = values.length - 1; i >= 0; i--) {
            Object value = values[i];
            if (value != null && value instanceof AutoCloseable) {
                Functions.Default.run(() -> ((AutoCloseable) value).close());
                Logger.L("%s closed()", value);
            }
        }
        closed = true;
        Logger.L("%s.close() run successfully!", this);
    }
}

So if a Tuple is composed of several elements where one depends on another, for example, as in the following pseudo code:

// Pseudo code: ExcelFile and ExcelSheet stand for AutoCloseable types of some Excel library
public Tuple4<File, InputStream, ExcelFile, ExcelSheet[]> getExcel(String filename) throws IOException {
    File file = new File(filename);
    InputStream stream = new FileInputStream(file);
    ExcelFile excel = new ExcelFile(stream);
    ExcelSheet[] sheets = excel.getSheets();
    return Tuple.create(file, stream, excel, sheets);
}
Tuple4<File, InputStream, ExcelFile, ExcelSheet[]> excelTuple = getExcel("somefile.xlsx");
... handling of the Excel sheets
excelTuple.close();

Since close() walks the elements in reverse order and closes those that are AutoCloseable, closing the excelTuple instance releases all involved resources in the right order. With factory methods like the one above, a try-with-resources statement can be simplified by treating the Tuple as a single group of resources.
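
The same idea can be tried without any Excel library. The following sketch uses a hypothetical ResourceGroup class (not part of functionExtensions) that closes its AutoCloseable elements in reverse order and ignores the others; unlike the library code, it simply swallows exceptions thrown while closing.

```java
// Hypothetical class illustrating "close a group of resources in reverse order".
final class ResourceGroup implements AutoCloseable {
    private final Object[] values;
    private boolean closed;

    ResourceGroup(Object... values) {
        this.values = values.clone();
    }

    @Override
    public void close() {
        if (closed) {
            return; // closing twice is a no-op
        }
        // Later elements usually depend on earlier ones, so close them first
        for (int i = values.length - 1; i >= 0; i--) {
            Object value = values[i];
            if (value instanceof AutoCloseable) {
                try {
                    ((AutoCloseable) value).close();
                } catch (Exception e) {
                    // Sketch: keep closing the remaining elements
                }
            }
        }
        closed = true;
    }
}
```

Used as try (ResourceGroup group = new ResourceGroup(stream, reader)) { ... }, a single resource declaration replaces one declaration per element, and reader is closed before stream.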

Filtering as Set

A special type of Tuple, Set, is defined to keep elements of the same type. It keeps their type information and uses it during comparisons, before evaluating the element values with the comparator of the Tuple class.

Since the elements of a Set are of the same type, the Set can return the stored elements as a strong-typed array:

public T[] asArray(){
    return Arrays.copyOf((T[]) values, values.length);
}

Two methods of Tuple can be used to extract certain kinds of its elements as a Set:

Set<T> getSetOf(Class<T> clazz): gets all non-null elements matching the given class as an immutable Set.
Set<T> getSetOf(Class<T> clazz, Predicate<T> valuePredicate): gets all non-null elements of the given class that also match the predefined criteria as an immutable Set.

The latter is implemented as below:

public <T> Set<T> getSetOf(Class<T> clazz, Predicate<T> valuePredicate){
    Objects.requireNonNull(clazz);
    Objects.requireNonNull(valuePredicate);
    List<T> matched = new ArrayList<>();

    Predicate<Class> classPredicate = TypeHelper.getClassEqualitor(clazz);

    int length = getLength();
    for (int i = 0; i < length; i++) {
        Object v = values[i];
        if(v != null){
            try {
                if(classPredicate.test(v.getClass()) && valuePredicate.test((T)v))
                    matched.add((T)v);
            }catch (Exception ex){
                // elements that fail the cast or the predicate are skipped
            }
        }
    }
    T[] array = (T[]) matched.stream().toArray();
    return setOf(clazz, array);
}

It is then convenient to get a Set that keeps all elements of the Tuple matching some criteria:

Tuple manyValues = Tuple.of("abc", null, 33, true, "a", "", 'a', Tuple.TRUE, 47);
assertEquals(Tuple.setOf("abc", "a", ""), manyValues.getSetOf(String.class));
assertEquals(Tuple.setOf("abc"), manyValues.getSetOf(String.class, s->s.length()>2));
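
For readers without the library at hand, the filtering step can be approximated with plain Java. The helper below is a hypothetical sketch, not the library's getSetOf: the name TypedFilterSketch is made up, it returns a typed array instead of a Set, and it matches with Class.isInstance (which also accepts subclasses), while the library relies on TypeHelper.getClassEqualitor, whose exact matching rules are not covered here.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Sketch only: select the non-null elements of a given class that satisfy a predicate.
final class TypedFilterSketch {
    static <T> T[] elementsOf(Object[] values, Class<T> clazz, Predicate<? super T> predicate) {
        List<T> matched = new ArrayList<>();
        for (Object value : values) {
            // isInstance(null) is false, so null elements are skipped
            if (clazz.isInstance(value) && predicate.test(clazz.cast(value))) {
                matched.add(clazz.cast(value));
            }
        }
        // Create an array of the real component type so callers get a true T[]
        @SuppressWarnings("unchecked")
        T[] array = (T[]) Array.newInstance(clazz, matched.size());
        return matched.toArray(array);
    }
}
```

Calling TypedFilterSketch.elementsOf(values, String.class, s -> s.length() > 2) on an Object[] holding "abc", null, 33, true, "a", "", 'a' and 47 keeps only "abc".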

In addition, to create a Set directly, there are two static methods defined in Tuple:

Set<T> setOf(T... elements): creates a Set of elements whose type is inferred by Java.
Set<T> setOf(Class<T> elementType, T[] elements): creates a Set whose element type is specified directly.

Thus a Set<Integer> is created if all elements are Integers:

Set<Integer> integerSet = Tuple.setOf(1, 2, 3);
assertTrue(Arrays.deepEquals(new Integer[]{1,2,3}, integerSet.asArray()));

But it is also possible to specify a shared type as the element type:

Set<Comparable> comparableSet = Tuple.setOf(Comparable.class, new Comparable[]{1.0, 'a', "abc"});
assertTrue(Arrays.deepEquals(new Comparable[]{1.0, 'a', "abc"}, comparableSet.asArray()));

How to use

The project is hosted at github.com. To include the library, add the following to your Maven project:

<dependency>
    <groupId>io.github.cruisoring</groupId>
    <artifactId>functionExtensions</artifactId>
    <version>1.0.1</version>
</dependency>

Once the package is included, the constructors of the Tuple classes are not accessible, so there are two ways to create strong-typed Tuples:

Calling the static Tuple.create(...) methods with a variable number of arguments. Depending on the length of the argument list:
- When no arguments are provided, the singleton Tuple.UNIT of Tuple0 is returned;
- When 1 - 20 arguments are provided, the constructors of Tuple1 - Tuple20 are called to create strong-typed Tuple1, Tuple2, ... Tuple20 instances;
- When more than 20 arguments are provided, a TuplePlus instance providing strong-typed access to the first 20 elements is returned.
Alternatively, the static method Tuple.of(...), also with a variable number of arguments, is just a wrapper of the above create() methods. However, the type information of the elements composing the created Tuple is erased when it is returned as a Tuple.

As a result, Tuple.create creates a strong-typed Tuple6 instance with all element type info retained in the declaration:

Tuple6<Integer, Character, Double, String, DayOfWeek, Boolean> tuple2 = 
    Tuple.create(1, 'a', 3.0, "abc", DayOfWeek.MONDAY, true);

Calling Tuple.of() needs a cast, but that makes it possible to declare the elements as more generic kinds of values:

Tuple6<Comparable, Object, Number, String, DayOfWeek, Boolean> tuple =
        (Tuple6<Comparable, Object, Number, String, DayOfWeek, Boolean>) Tuple.of(1, 'a', 3.0, "abc", DayOfWeek.MONDAY, true);

The first 3 elements of the Tuple instance are declared as Comparable, Object and Number instead of Integer, Character and Double, which allows the same Tuple variable declared above to hold different instances with different kinds of elements later.

Either way, once the Tuple instance is declared with the element type info, its elements can be retrieved as declared:

Comparable first = tuple.getFirst();
assertEquals(Integer.valueOf(1), first);
assertEquals(Character.valueOf('a'), tuple.getSecond());
assertEquals(Double.valueOf(3.0), tuple.getThird());

Conclusion

In conclusion, the Tuple classes make it easy to keep multiple values of different types in an immutable data structure and to retrieve those values conveniently with the strong-typed accessors. These immutable values can be compared by value with different strategies, serially or in parallel, closed automatically if any of them are AutoCloseable, and filtered to generate a special collection of data, the Set, for further processing.

Although it is possible to use the Tuple classes alone as a generic data structure to keep and manage data of variable length and types, the meat of the matter is to use them as keys or values of a Java Map, with functions as first-class members, to build powerful utilities that combine buffering and business logic. That will be discussed in the upcoming last episode of this series:

functionExtensions Techniques 3: Repository

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By

JIANGWilliam

Software Developer

Australia
