Bannalia: trivial notes on themes diverse: 2014

Saturday, May 31, 2014

Indiscernible properties

Max Black's argument against the principle of identity of indiscernibles (PII for short) contends that one could conceive of a plurality of objects with exactly the same observable properties (for some notion of "observable" and "property"), and proposes the simple example of a universe composed exclusively of two exactly similar spheres some distance apart from each other. In this perfectly symmetrical setup there is no way to tell one sphere from the other except by arbitrarily picking one, and that selection cannot be based on their observable properties. So the spheres are different (there are two of them) but all their salient properties are the same, thus breaking PII.

Let us simplify the two-sphere example even further, down to a universe with two points that have no intrinsic physical property except their being at some distance from each other. In the 2P universe, formulated as a theory in second-order logic, there are no primitive properties (unary relations) to discuss and only one primitive binary relation C, defined as

C(x, y) := the distance from x to y is zero

(C stands for "colocated") satisfying the following axioms:

∀x C(x, x) (each point is colocated with itself),
∀x∃y ¬C(x, y) (for each point there is another one not colocated with it).

Now, if we accept the axiom scheme of comprehension, we have

∀x∃R∀y(Ry ↔ C(x, y)),

which, in combination with the axioms for C, implies

∀x∃R∃y(Rx ∧ ¬Ry),

that is, for each point x there is a property (namely that of being colocated with x) that x has and some other point y (which, in 2P, is tantamount to saying any other point) does not have. This seems to restore PII.

To this reasoning Black could have retorted in (at least) two different ways:

"Being colocated with x" is just a slightly disguised form of "being identical with x", which is trivially true of x itself and trivially false of any other entity, adding nothing to our initial assumption that 2P has a population of two.
The reasoning involves two ill-defined properties, "being colocated with a" and "being colocated with b", where a and b are one point and the other, or the other way around: without any discernible feature to use, there is no way we can select a particular point out of the two we have in 2P.

The first objection we can easily dispose of: "being colocated with x" is a bound version (via comprehension) of C, which is a perfectly discernible, physical property of pairs of points —if I'm given two points, identical or not, I can certainly inspect whether they are colocated. The second counterargument is more interesting and, in my opinion, allows us to provide a clearer formulation of Black's thesis. The objection could be rephrased as: for the same reasons that naming the two points "a" and "b" requires that we can previously tell one point from the other, their univocally associated properties R_a and R_b are also indiscernible, even if they are different. We can formalize indiscernibility in the following way:

Definition. A second-order theory T (with or without equality) is said to have indiscernible entities if there exists a formula φ with two free variables such that

∃x∃y (φ(x, x) ∧ φ(y, y) ∧ ¬φ(x, y)) (1)

is a theorem of T and, for each model M of T and a, b ∈ M satisfying (1), the structure M' obtained by swapping a and b in all the relations of M is also a model of T.

If the set of primitive relations of T is finite, indiscernibility can be expressed as a statement in T itself (sketch of proof: augment (1) with a conjunction of terms expressing swappability of x and y with respect to each position of each primitive relation of T). If T has equality, the statement

∃x∃y (φ(x, x) ∧ φ(y, y) ∧ ¬φ(x, y) ∧ x ≠ y) (2)

is a theorem of T (proof trivial). 2P has indiscernible entities (proof trivial).
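The two "proof trivial" claims about 2P can be checked mechanically on its obvious finite model. The following C++ sketch is only an illustration under stated assumptions: the domain is taken to be {0, 1}, φ is instantiated as C itself, and all names (relation2, make_colocated, swapped and so on) are invented here rather than taken from the text.

```cpp
#include <array>
#include <cassert>

// Hypothetical finite model of 2P: domain {0, 1}, C stored as a truth table.
using relation2 = std::array<std::array<bool, 2>, 2>;

// C(x, y) holds exactly when x and y are the same point (distance zero).
relation2 make_colocated() {
  relation2 c{};
  for (int x = 0; x < 2; ++x)
    for (int y = 0; y < 2; ++y)
      c[x][y] = (x == y);
  return c;
}

// Axiom 1: every point is colocated with itself.
bool reflexive(const relation2& c) {
  return c[0][0] && c[1][1];
}

// Axiom 2: every point has some point that is not colocated with it.
bool has_distinct_partner(const relation2& c) {
  for (int x = 0; x < 2; ++x)
    if (c[x][0] && c[x][1]) return false;
  return true;
}

// Formula (1) with phi := C: some x, y satisfy C(x,x), C(y,y) and not C(x,y).
bool witnesses_formula_1(const relation2& c) {
  for (int x = 0; x < 2; ++x)
    for (int y = 0; y < 2; ++y)
      if (c[x][x] && c[y][y] && !c[x][y]) return true;
  return false;
}

// The relation obtained by exchanging the roles of points 0 and 1.
relation2 swapped(const relation2& c) {
  relation2 r{};
  for (int x = 0; x < 2; ++x)
    for (int y = 0; y < 2; ++y)
      r[x][y] = c[1 - x][1 - y];
  return r;
}

// Usage: the model satisfies both axioms and (1), and swapping leaves C intact.
void check_2p_model() {
  const relation2 c = make_colocated();
  assert(reflexive(c) && has_distinct_partner(c));
  assert(witnesses_formula_1(c));
  assert(swapped(c) == c);
}
```

Since swapping the two points leaves the only primitive relation unchanged, the swapped structure satisfies exactly the same axioms, which is the condition the definition asks for.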

In conclusion, a formal interpretation of PII resists Black's attacks, but the following, stricter version of the principle:

If two entities have the same discernible properties, they are identical,

is false, and probably what Black had originally in mind.

Friday, May 16, 2014

The perfect shape

A liquid container designed to keep its content cool as long as possible would be spherical in shape, since the sphere is the closed surface with the minimum ratio of area to enclosed volume, thus minimizing convection heating. When the container is also meant to be drunk from, the sphere is no longer the optimum shape for coolness (leaving aside the fact that one would have to pierce it to access the liquid) because the surface of the remaining liquid changes as the container is being emptied.

A volume V of liquid at temperature T inside a glass is heated by convection on two different surfaces, one in direct contact with the environment and the other as enclosed by the glass wall.

These two surfaces contribute to the increase in temperature, given by Newton's law of cooling (here heating):

dT = (1/(c_v V))(h_open A_open + h_closed A_closed)(T_env - T) dt,

where c_v is the volumetric heat capacity of the liquid, h_open and h_closed the overall heat transfer coefficients of the open and closed interfaces, respectively, and T_env the temperature of the environment. If the liquid is drunk at a constant rate of ϕ units of volume per unit of time, we have

dT = (1/(c_v ϕ V))(h_open A_open + h_closed A_closed)(T - T_env) dV,

with A_open and A_closed varying as functions of V; this differential equation implicitly gives T as a function of (A_open, A_closed, V), although its analytical solution is only feasible for the simplest shapes. We define the optimum container as that for which the average temperature of the liquid during consumption

T_avg = (1/V_init) ∫_{[0,V_init]} T dV

is minimum.
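For the simplest shape, a cylinder, the equation can be integrated numerically in a few lines. The sketch below is an assumption-laden illustration, not the program used for the results: it takes T_avg to be the volume-weighted mean of T, uses dV = -ϕ dt (so that mean equals the time average while drinking), freezes the coefficients over each small time step, and works in cm, s, J and K. The names drink_params and average_temperature_cylinder are made up for this example.

```cpp
#include <cassert>
#include <cmath>

// Illustrative parameter bundle; units are cm, s, J and K.
struct drink_params {
  double v_init;    // initial volume, cm^3
  double phi;       // drinking rate, cm^3/s
  double t_init;    // initial liquid temperature
  double t_env;     // ambient temperature
  double c_v;       // volumetric heat capacity, J/(cm^3 K)
  double h_open;    // open-surface coefficient, W/(cm^2 K)
  double h_closed;  // wall coefficient, W/(cm^2 K)
};

// Average liquid temperature while a cylinder of the given radius is emptied.
double average_temperature_cylinder(double radius, const drink_params& p,
                                    int steps = 100000) {
  assert(radius > 0 && steps > 0 && p.v_init > 0 && p.phi > 0);
  const double pi = 3.14159265358979323846;
  const double dt = (p.v_init / p.phi) / steps;  // time step, s
  const double base = pi * radius * radius;      // cross-section area
  double temp = p.t_init;
  double sum = 0.0;
  for (int i = 0; i < steps; ++i) {
    // Remaining volume at the middle of the step (never reaches zero).
    const double v = p.v_init - p.phi * dt * (i + 0.5);
    const double a_open = base;                       // free surface
    const double a_closed = base + 2.0 * v / radius;  // bottom + wetted wall
    // Newton's law with frozen coefficients: (T_env - T) decays at rate k.
    const double k = (p.h_open * a_open + p.h_closed * a_closed) / (p.c_v * v);
    const double next = p.t_env - (p.t_env - temp) * std::exp(-k * dt);
    sum += 0.5 * (temp + next);  // trapezoidal contribution to the average
    temp = next;
  }
  return sum / steps;
}
```

The exponential update is used instead of a plain Euler step because the rate k grows without bound as the remaining volume approaches zero, where an explicit step could overshoot T_env.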

Let us calculate this shape numerically. Candidate glasses are restricted to surfaces of revolution generated by nonnegative functions y = f(x) between 0 and some maximum height h.

We fix the rest of parameters: the contained liquid is one liter of water (or beer, for that matter) brought out of the fridge at 5 ºC and drunk in 3 minutes in a very hot summer afternoon at 40 ºC; the glass wall (made of, well, glass) is set to be 3 mm thick. All of this gives us:

V_init = 1,000 cm³,
ϕ = 1,000/180 cm³/s,
T_init = 5 ºC,
T_env = 40 ºC,
c_v = 4.1796 J/cm³/K,
h_water = 50 W/m²/K,
h_air = 100 W/m²/K,
θ_glass = 3 mm,
k_glass = 0.85 W/m/K,
h_open = 1/((1/h_water)+(1/h_air)) = 33.3 W/m²/K,
h_closed = 1/((1/h_water)+(θ_glass/k_glass)+(1/h_air)) = 29.8 W/m²/K.
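The last two figures follow from adding thermal resistances in series (1/h for each fluid film, thickness/conductivity for the wall). A minimal sketch of that arithmetic, with a helper name (overall_coefficient) chosen only for this example:

```cpp
#include <cassert>
#include <initializer_list>

// Overall heat transfer coefficient of layers in series: the thermal
// resistances (1/h for a film, thickness/k for a solid wall) add up.
double overall_coefficient(std::initializer_list<double> resistances) {
  double total = 0.0;
  for (double r : resistances) {
    assert(r > 0.0);
    total += r;
  }
  return 1.0 / total;
}
```

With SI units throughout, overall_coefficient({1.0/50, 1.0/100}) evaluates to about 33.3 W/m²/K, and adding the wall term 0.003/0.85 gives about 29.8 W/m²/K, in agreement with the values listed.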

We will use a genetic algorithm to solve the problem:

Candidate solutions are codified as arrays with the values f(0), f(0.5 cm), ... , f(50 cm) (that is, the maximum height of the glass is h = 50 cm, much higher than really needed —excess height will simply manifest as a tiny column of radius ≈ 0).
The initial pool of candidates is populated with values from random Hermite interpolation polynomials of degree 3 normalized so that the enclosed volume is V_init.
The fitness of an individual is simply its calculated T_avg: the lower T_avg, the fitter the solution.
At each generation, the fittest 25% of the pool is kept and the rest replaced by random breeding of the former. Crossover of f₀ and f₁ is implemented by randomly selecting two cutoff points c₁ and c₂, putting f as:
- f(x) = f₀(x) for x < c₁,
- f(x) = f₀(x)·(c₂- x)/(c₂- c₁) + f₁(x)·(x - c₁)/(c₂- c₁) for c₁ ≤ x < c₂,
- f(x) = f₁(x) for x ≥ c₂,
and normalizing f.
Bred individuals are mutated (before normalization) by multiplying some of their values (mutation rate = 1%) by a random factor in the range [0,2).
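The breeding steps can be sketched as follows. This is not the original program (which uses Boost): it is a plain standard-library illustration in which cut points are given as sample indices, normalization is assumed to be a uniform rescaling of the radii, and the names profile, volume, normalize, crossover and mutate are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

using profile = std::vector<double>;  // radii f(0), f(dx), f(2 dx), ...

// Volume of the solid of revolution: pi * integral of f(x)^2 (trapezoids).
double volume(const profile& f, double dx) {
  const double pi = 3.14159265358979323846;
  double sum = 0.0;
  for (std::size_t i = 0; i + 1 < f.size(); ++i)
    sum += 0.5 * (f[i] * f[i] + f[i + 1] * f[i + 1]) * dx;
  return pi * sum;
}

// Assumed normalization: rescale all radii so the volume equals target
// (volume grows with the square of the radius, hence the square root).
void normalize(profile& f, double dx, double target) {
  const double v = volume(f, dx);
  assert(v > 0.0);
  const double s = std::sqrt(target / v);
  for (double& y : f) y *= s;
}

// f0 below c1, f1 from c2 on, and a linear blend of both in between.
profile crossover(const profile& f0, const profile& f1,
                  std::size_t c1, std::size_t c2) {
  assert(f0.size() == f1.size() && c1 < c2 && c2 <= f0.size());
  profile f(f0.size());
  for (std::size_t i = 0; i < f.size(); ++i) {
    if (i < c1) {
      f[i] = f0[i];
    } else if (i < c2) {
      const double w = double(i - c1) / double(c2 - c1);
      f[i] = f0[i] * (1.0 - w) + f1[i] * w;
    } else {
      f[i] = f1[i];
    }
  }
  return f;
}

// Multiply each value, with probability rate, by a random factor in [0, 2).
template<class Rng>
void mutate(profile& f, double rate, Rng& rng) {
  std::bernoulli_distribution hit(rate);
  std::uniform_real_distribution<double> factor(0.0, 2.0);
  for (double& y : f)
    if (hit(rng)) y *= factor(rng);
}
```

A child would then be produced by calling crossover on two surviving parents, mutate with rate 0.01, and finally normalize with the target volume.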

The calculation program has been implemented in C++ (Boost, and in particular Boost.Units, is used). Trials with different initial populations and algorithm parameters (mutation rates, pool sizes and so on) yield basically the same result, which is depicted in the figure:

Optimum glass profile.

The area of this glass is 537 cm² and its associated average temperature T_avg is 6.40 ºC (quite cool!). As predicted, the excess height from ~25 cm up shows as a tiny filament whose actual capacity is totally negligible: trimming the glass cap off at a height of 18.5 cm produces a more practical glass with an opening diameter of 4.2 cm and almost the same performance (we only drop 2% of the initial content, and the temperature of the liquid at that point, 3.6 seconds after beginning to drink, is only slightly higher than T_init). The glass thus obtained is, perhaps surprisingly, very reminiscent of real-life glasses:

Optimum glass trimmed at 18.5 cm.

In general, trimming the optimum glass f(x) for a given T_init at height h_cut produces the optimum solution of the problem for V_init = V(height = h_cut), T_init = T(height = h_cut) with the additional constraint that the opening of the glass is precisely 2f(h_cut): we can then play with this property to obtain solutions to the modified problem

minimize T_avg while maintaining a given opening diameter δ

by starting with V_init + ΔV and T_init - ΔT and adjusting ΔV and ΔT until the volume and temperature at the desired opening hit V_init and T_init simultaneously (experimenting with this is left as an exercise for the reader).

Parameter sensitivity

An informal analysis reveals that the optimum glass shape is quite independent of changes in most problem parameters. In particular:

T_init does not significantly affect the solution (as long as T_init < T_env).
Changing V_init simply produces a scaled version of the same shape.
Decreasing ϕ (that is, taking longer to drink the beer up) yields only slightly different shapes, with a wider lower segment.

The one aspect that truly changes the shape is the h_open/h_closed ratio. As this increases, the optimum glass gets thinner at the base and longer, approaching a cylinder as h_open/h_closed → ∞. This has practical implications: for instance, making the beer grow a thick foam topping reduces h_open, which allows for shorter, wider, rounder glasses.

Postscript

Some readers have asked why we haven't used calculus of variations to solve the problem. In its classical one-dimensional formulation (Euler-Lagrange equation), this calculus provides tools for finding the stationary values of a functional J(f) defined as

J(f) = ∫_[a,b] L(f(x), f'(x), x) dx,

for some function L : ℝ³ → ℝ with appropriate smoothness properties. The important aspect to note here is that L must depend on the point values of f and f' at x alone: by contrast, in our case the integrand T(V) is a function of h_open A_open(v) + h_closed A_closed(v) for all values of v in [V, V_init], so the conditions for the application of the Euler-Lagrange equation do not hold.
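For reference, the stationarity condition in question is the standard Euler-Lagrange equation, written here for the functional J above:

```latex
\frac{\partial L}{\partial f}\bigl(f(x), f'(x), x\bigr)
  - \frac{d}{dx}\,\frac{\partial L}{\partial f'}\bigl(f(x), f'(x), x\bigr) = 0,
\qquad a \le x \le b.
```

Every term is evaluated at the single point x, which is exactly the locality that the temperature integrand lacks.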

Wednesday, May 7, 2014

Fast polymorphic collections with devirtualization

The poly_collection template class we implemented for fast handling of polymorphic objects can potentially be sped up in the case where some of the derived classes it manages are known to the compiler at the point of template instantiation. Let us use this auxiliary template class:

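#include <algorithm> // std::for_each
#include <vector>

// Stores objects of exact type Derived contiguously, by value.
template<class Derived>
class poly_collection_static_segment
{
public:
  void insert(const Derived& x)
  {
    store.push_back(x);
  }
  
  // f is invoked with arguments of static type Derived&
  template<typename F>
  void for_each(F& f)
  {
    std::for_each(store.begin(),store.end(),f);
  }
  
  // f is invoked with arguments of static type const Derived&
  template<typename F>
  void for_each(F& f)const
  {
    std::for_each(store.begin(),store.end(),f);
  }

private:
  std::vector<Derived> store;
};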

We can add the ability for poly_collection to accept a variable number of derived class types:

template<class Base,class ...Derivedn>
class poly_collection
{
  ...
};

so that it includes poly_collection_static_segment<Derivedi> components taking care of objects whose types coincide exactly with those specified, whereas for the rest the default dynamic handler is still used.
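To make the idea concrete, here is one possible way such a dispatch could be arranged. It is a sketch under loud assumptions, not the actual poly_collection: it requires C++17 (fold expressions, if constexpr, std::apply), the class name poly_collection_sketch and the example classes are invented, plain std::vector stands in for the static segments, and the dynamic fallback is a simple vector of std::unique_ptr rather than the real dynamic handler.

```cpp
#include <cassert>
#include <memory>
#include <tuple>
#include <type_traits>
#include <vector>

template<class Base, class... Derived>
class poly_collection_sketch {
public:
  template<class T>
  void insert(const T& x) {
    static_assert(std::is_base_of_v<Base, T>, "T must derive from Base");
    if constexpr ((std::is_same_v<T, Derived> || ...)) {
      // Exact match with a listed type: store by value in its own vector.
      std::get<std::vector<T>>(static_segments).push_back(x);
    } else {
      // Any other type: fall back to dynamic storage (an assumption here).
      dynamic_segment.push_back(std::make_unique<T>(x));
    }
  }

  template<class F>
  void for_each(F f) {
    // In each static segment the exact element type is known statically.
    auto visit = [&](auto& segment) {
      for (auto& x : segment) f(x);
    };
    std::apply([&](auto&... segment) { (visit(segment), ...); },
               static_segments);
    for (auto& p : dynamic_segment) f(*p);
  }

private:
  std::tuple<std::vector<Derived>...> static_segments;
  std::vector<std::unique_ptr<Base>> dynamic_segment;
};

// Hypothetical classes used only to exercise the sketch.
struct base {
  virtual ~base() = default;
  virtual int do_stuff() = 0;
};
struct derived1 : base { int do_stuff() override { return 1; } };
struct derived2 : base { int do_stuff() override { return 2; } };

// Usage: derived1 lands in its static segment, derived2 in the fallback.
int poly_collection_example_sum() {
  poly_collection_sketch<base, derived1> c;
  c.insert(derived1{});
  c.insert(derived1{});
  c.insert(derived2{});
  int sum = 0;
  c.for_each([&sum](base& x) { sum += x.do_stuff(); });
  assert(sum == 4);  // 1 + 1 + 2
  return sum;
}
```

The compile-time test on T is what routes exact matches to by-value storage; types that merely derive from a listed class are deliberately sent to the dynamic path, mirroring the "coincide exactly" requirement.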

What does this gain us? Consider the example:

class base
{
public:
  virtual void do_stuff()=0;
};

class derived1:public base{...};
...
poly_collection<base,derived1> c;
c.insert(derived1(0));
...
c.for_each([](base& x){
  x.do_stuff();
});

For the segment handling derived1, the code in for_each resolves to:

std::for_each(store.begin(),store.end(),[](base& x){
  x.do_stuff();
});

where store is of type std::vector<derived1>: so, the compiler can know that x.do_stuff() is invoked on objects with exact type derived1 and hence omit vtable lookup and even, if the definition of derived1::do_stuff is available at instantiation time, inline the call itself.

Now, the fact that the compiler can apply these optimizations does not mean it necessarily will do so: the static analysis needed to determine the optimization opportunity is certainly not trivial. There are a number of techniques we can implement to help the compiler in the process:

1. Make derived1::do_stuff final.
2. Use a polymorphic lambda [](auto& x){x.do_stuff();} to omit the cast from derived1& to base&.
3. Do both 1 and 2.
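A small sketch of both techniques combined, assuming C++14 for the generic lambda. The class final_derived1 and the helper sum_segment are illustrative only, and do_stuff returns an int here merely so the effect can be observed.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <vector>

struct base {
  virtual ~base() = default;
  virtual int do_stuff() = 0;
};

// Technique 1: a final member function cannot be overridden further, so a
// call through final_derived1& has exactly one possible target.
struct final_derived1 : base {
  int do_stuff() final { return 1; }
};

// Technique 2: a generic (polymorphic) lambda is instantiated with the
// element type of the segment, so no conversion to base& takes place.
int sum_segment(std::vector<final_derived1>& store) {
  int sum = 0;
  std::for_each(store.begin(), store.end(), [&sum](auto& x) {
    static_assert(std::is_same<decltype(x), final_derived1&>::value,
                  "x is deduced as the exact stored type");
    sum += x.do_stuff();
  });
  return sum;
}

// Usage: three elements, each contributing 1.
void final_and_generic_lambda_example() {
  std::vector<final_derived1> store(3);
  assert(sum_segment(store) == 3);
}
```

Whether a given compiler actually turns these calls into direct or inlined ones still depends on its optimizer; the code only removes the obstacles it would otherwise have to reason around.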

I have written a test program (Boost required) that implements this new version of poly_collection and measures its for_each performance for the following scenarios:

1. poly_collection<base> (baseline)
2. poly_collection<base,derived1>
3. poly_collection<base,derived1,derived2>
4. poly_collection<base,derived1,derived2,derived3>
5. poly_collection<base,final_derived1>
6. poly_collection<base,final_derived1,final_derived2>
7. poly_collection<base,final_derived1,final_derived2,final_derived3>
8. same as 1, with a polymorphic functor
9. same as 2, with a polymorphic functor
10. same as 3, with a polymorphic functor
11. same as 4, with a polymorphic functor
12. same as 5, with a polymorphic functor
13. same as 6, with a polymorphic functor
14. same as 7, with a polymorphic functor

with containers of several sizes ranging from n = 1,000 to 10⁷ elements (except where noted, results do not vary significantly across sizes). Values are the averages of execution times / number of elements for each scenario, normalized to the baseline = 1.
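The kind of figure reported (time per element, relative to the baseline) can be obtained along these lines. This is a generic illustration with invented helper names, not the measuring code of the test program, and a real benchmark would repeat each run several times.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Wall-clock seconds per element for one call of run(), which is assumed
// to traverse n elements.
template<class F>
double seconds_per_element(F run, std::size_t n) {
  assert(n > 0);
  const auto start = std::chrono::steady_clock::now();
  run();
  const std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;
  return elapsed.count() / static_cast<double>(n);
}

// Express every scenario relative to the first one (the baseline = 1).
std::vector<double> normalize_to_baseline(const std::vector<double>& times) {
  assert(!times.empty() && times.front() > 0.0);
  std::vector<double> out;
  out.reserve(times.size());
  for (double t : times) out.push_back(t / times.front());
  return out;
}
```

A normalized value below 1 then means the scenario is faster per element than the plain poly_collection<base> baseline.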

MSVC 2012

Microsoft Visual Studio 2012 using default release mode settings on a Windows box with an Intel Core i5-2520M CPU @2.50GHz.